VARIANTS OF ALTERNATIVE SPLICING



FIELD OF THE INVENTION 

The present invention concerns novel nucleic acid sequences, vectors and 
host cells containing them, amino acid sequences encoded by said sequences, and 
antibodies reactive with said amino acid sequences, as well as pharmaceutical 
compositions comprising any of the above. The present invention further concerns 
methods for screening for candidate activator or deactivators utilizing said amino 
acid sequences. 

BACKGROUND OF THE INVENTION 

Alternative splicing (AS) is an important regulatory mechanism in higher
eukaryotes (P.A. Sharp, Cell 77, 805-815 (1994)). It is thought to be one of the
important mechanisms for differential expression related to tissue or
developmental stage specificity. It is known to play a major role in numerous
biological systems, including human antibody responses and sex determination in
Drosophila (S. Stamm, M.Q. Zhang, T.G. Marr and D.M. Helfman, Nucleic
Acids Research 22, 1515-1526 (1994); B. Chabot, Trends Genet. 12, 472-478
(1996); R.E. Breitbart, A. Andreadis, B. Nadal-Ginard, Annu. Rev. Biochem.,
56, 467-495 (1987); C.W. Smith, J.G. Patton, B. Nadal-Ginard, Annu. Rev.
Genet., 27, 527-577 (1989)).

Until recently it was commonly believed that alternative splicing existed
in only a small fraction of genes (about 5%). A recent observation based on a
literature survey of known genes revises this estimate upward, stating that at
least 30% of human genes are alternatively spliced (M.S. Gelfand, I. Dubchak,
I. Dralyuk and M. Zorn, Nucleic Acids Research 27, 301-302 (1999)). The
importance of the actual frequency of this phenomenon lies not only in the direct
impact on the number of proteins created (100,000 human genes, for example,
would be translated into a much higher number of proteins), but also in the
diversity of functionality derived from the process.

Several mechanisms acting at different stages may be held responsible for the
complexity of higher eukaryotes. Those characterized to date include alternative
splicing at the transcription level, RNA editing at the post-transcriptional
level, and post-translational modifications.

GLOSSARY

In the following description and claims use will be made, at times, of a
variety of terms, and the meaning of such terms as they should be construed in
accordance with the invention is as follows:

"Variant nucleic acid sequence" - the sequence shown in any one of the
sequences denoted NV_1 to NV_48611, which are listed in the attached CD-ROM
marked "New Variants October 2000" (hereinafter "CD-ROM"), sequences having
at least 90% identity (see below) to said sequence, and fragments (see below) of
the above sequences of at least 20 b.p. long. The sequences are divided into 43
files according to their functional groups, as will be explained hereinbelow. For
convenience's sake, NV_1 to NV_48611 will be denoted SEQ ID NO:1 to SEQ ID
NO:48611, respectively, in the following description. These sequences are
sequences coding for novel, naturally occurring, alternative splice variants of native
and known genes. It should be emphasized that the novel variants of the present
invention are naturally occurring sequences resulting from alternative splicing of
genes and not merely truncated, mutated or fragmented forms of known sequences.
Thus the alternative splice variants of the invention have physiological significance
as regards where, in what tissues, when, at which developmental stage and under
which conditions (such as diseases, etc.) their expression is modulated, i.e., ceased,
increased, up-regulated or down-regulated.

"Variant product" - also referred to at times as the "variant protein" or "variant
polypeptide" - is an amino acid sequence encoded by the variant nucleic acid
sequence, which is a naturally occurring mRNA sequence obtained as a result of
alternative splicing. The amino acid sequence may be a peptide, a protein, as well
as peptides or proteins having chemically modified amino acids (see below) such as
a glycopeptide or glycoprotein. The term also includes homologues (see below) of
said sequences in which one or more amino acids has been added, deleted,
substituted (see below) or chemically modified (see below), as well as fragments
(see below) of this sequence having at least 10 amino acids. More specifically, it
concerns the amino acid sequences present in the above CD-ROM. The directory
name of each file is the functional family name. The nucleic acid sequences file
name is (family-name)_for_patent, and the protein sequences file name is
(family-name)_pep_patent. The amino acid sequences are also denoted NV_1 to
NV_48611, each protein having the same NV_... number as the nucleic acid
sequence from which it was coded.

"Nucleic acid sequence" - a sequence composed of DNA nucleotides, RNA 
nucleotides or a combination of both types and may include natural nucleotides, 
chemically modified nucleotides and synthetic nucleotides.

"Amino acid sequence" - a sequence composed of any one of the 20 naturally 
appearing amino acids, amino acids which have been chemically modified (see 
below), or composed of synthetic amino acids. 






"Fragment of variant nucleic acid sequence" - novel short stretch of nucleic 
acid sequences of at least 20 b.p., which does not appear as a continuous stretch
in the original nucleic acid sequence (see below). The fragment may be a
sequence which was previously undescribed in the context of the published RNA
and which affects the amino acid sequence encoded by the known gene. For
example, where the variant nucleic acid sequence includes a sequence which was
not included in the original sequence (a sequence which was an intron in the
original sequence), the fragment includes that additional sequence. The fragment
may also be a region which is not an intron and which was not present in the
original sequence. Another example is when the variant lacks a non-terminal region
which was present in the original sequence. The two stretches of nucleotides
flanking this region (upstream and downstream of this region) are brought
together by splicing in the variant, but are spaced from each other by that region
in the original sequence and are thus not continuous there. A continuous stretch of
nucleic acids comprising said two stretches of nucleotides is not present in the
original sequence but is present in the variant, and thus falls under the
definition of fragment.
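
The exon-exclusion case above can be illustrated with a minimal sketch. The
function name, the toy sequences and the sliding-window approach are assumptions
made for illustration only; they are not taken from the CD-ROM or from any
method described herein. The sketch lists every 20 b.p. window of a variant that
does not occur as a continuous stretch in the original sequence.

```python
def novel_fragments(variant: str, original: str, min_len: int = 20) -> list[str]:
    """Return each window of `min_len` bases in `variant` that does not occur
    as a continuous stretch anywhere in `original`."""
    variant, original = variant.upper(), original.upper()
    windows = []
    for start in range(len(variant) - min_len + 1):
        window = variant[start:start + min_len]
        if window not in original:
            windows.append(window)
    return windows


# Hypothetical toy sequences: the variant lacks the non-terminal region.
upstream = "ATGGCCATTGTAATGGGCCGCTGAAAGG"
skipped = "TTTTTTTTTTTTTTTTTTTTTTTT"
downstream = "GTGCCCGATAGCATCGGATCCAAT"
original = upstream + skipped + downstream
variant = upstream + downstream

# Only windows that span the new junction are absent from the original.
junction_windows = novel_fragments(variant, original)
```

In this toy case every reported window overlaps the point where the upstream and
downstream stretches were joined, which is the situation the definition
describes.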



"Fragments of variant products" - novel amino acid sequences coded by the 
"fragment of variant nucleic acid sequence" defined above.

"Homologues of variants" - amino acid sequences of variants in which one or 
more amino acids has been added, deleted or replaced. The addition, deletion or 
replacement should be in regions or adjacent to regions where the variant differs 
from the original sequence (see below).

"Conservative substitution" - refers to the substitution of an amino acid in one 
class by an amino acid of the same class, where a class is defined by common 
physicochemical amino acid side chain properties and high substitution
frequencies in homologous proteins found in nature, as determined, for example,
by a standard Dayhoff frequency exchange matrix or BLOSUM matrix. Six
general classes of amino acid side chains have been categorized and include:
Class I (Cys); Class II (Ser, Thr, Pro, Ala, Gly); Class III (Asn, Asp, Gln, Glu);
Class IV (His, Arg, Lys); Class V (Ile, Leu, Val, Met); and Class VI (Phe, Tyr,
Trp). For example, substitution of an Asp for another class III residue such as
Asn, Gln, or Glu, is a conservative substitution.

"Non-conservative substitution" - refers to the substitution of an amino acid in
one class with an amino acid from another class; for example, substitution of an
Ala, a class II residue, with a class III residue such as Asp, Asn, Glu, or Gln.
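
The six side-chain classes can be captured in a small lookup, sketched here
under the assumption that residues are given by their standard three-letter
codes. The names SIDE_CHAIN_CLASSES and classify_substitution are illustrative
only.

```python
# The six side-chain classes listed in the definition above.
SIDE_CHAIN_CLASSES = {
    "I": {"Cys"},
    "II": {"Ser", "Thr", "Pro", "Ala", "Gly"},
    "III": {"Asn", "Asp", "Gln", "Glu"},
    "IV": {"His", "Arg", "Lys"},
    "V": {"Ile", "Leu", "Val", "Met"},
    "VI": {"Phe", "Tyr", "Trp"},
}

# Reverse lookup: residue -> class label.
CLASS_OF = {
    residue: label
    for label, members in SIDE_CHAIN_CLASSES.items()
    for residue in members
}


def classify_substitution(old: str, new: str) -> str:
    """Label a substitution as identical, conservative or non-conservative."""
    if old == new:
        return "identical"
    if CLASS_OF[old] == CLASS_OF[new]:
        return "conservative"
    return "non-conservative"
```

With this table, classify_substitution("Asp", "Asn") falls within class III,
while classify_substitution("Ala", "Asp") crosses from class II to class III,
matching the two examples given in the definitions.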

"Chemically modified" - when referring to the product of the invention, means a 
product (protein) where at least one of its amino acid residues is modified either
by natural processes, such as processing or other post-translational modifications,
or by chemical modification techniques which are well known in the art. Among
the numerous known modifications, typical but not exclusive examples include:
acetylation, acylation, amidation, ADP-ribosylation, glycosylation, GPI anchor
formation, covalent attachment of a lipid or lipid derivative, methylation,
myristoylation, pegylation, prenylation, phosphorylation, ubiquitination, or any
similar process.

"Biologically active" - refers to the variant product having some sort of 
biological activity, for example, some physiologically measurable effect on target 
cells, molecules or tissues.

"Immunologically active" defines the capability of a natural, recombinant or 
synthetic variant product, or any fragment thereof, to induce a specific immune
response in appropriate animals or cells and to bind with specific antibodies.
Thus, for example, an immunologically active fragment of variant product
denotes a fragment which retains some or all of the immunological properties of
the variant product, e.g. can bind specific anti-variant product antibodies or
can elicit an immune response which will generate such antibodies or cause
proliferation of specific immune cells which produce variant.

"Optimal alignment" - is defined as an alignment giving the highest percent
identity score. Such alignment can be performed using a variety of commercially
available sequence analysis programs, such as the local alignment program
LALIGN using a ktup of 1, default parameters and the default PAM. A preferred
alignment is the one performed using the CLUSTAL-W program from
MacVector (TM), operated with an open gap penalty of 10.0, an extended gap
penalty of 0.1, and a BLOSUM similarity matrix. If a gap needs to be inserted
into a first sequence to optimally align it with a second sequence, the percent
identity is calculated using only the residues that are paired with a corresponding
amino acid residue (i.e., the calculation does not consider residues in the second
sequence that are in the "gap" of the first sequence). In the case of alignments of
known gene sequences with that of the new variant, the optimal alignment
invariably includes aligning the identical parts of both sequences together, then
keeping apart and unaligned the sections of the sequences that differ one from the
other.

"Having at least 90% identity" - with respect to two amino acid or nucleic acid 
sequences, refers to the percentage of residues that are identical in the
two sequences when the sequences are optimally aligned. Thus, 90% amino acid
sequence identity means that 90% of the amino acids in two or more optimally
aligned polypeptide sequences are identical; however, this definition explicitly
excludes sequences which are 100% identical with the original sequence from
which the variant of the invention was varied.
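
The percent identity calculation can be sketched as follows, assuming the two
sequences have already been optimally aligned by an external program and that
gaps are written as "-". The function names and the gap symbol are assumptions
made for illustration; the sketch does not perform the alignment itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues, counting only columns in which both
    sequences carry a residue (gap columns are ignored)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    paired = [(a, b) for a, b in zip(aligned_a, aligned_b)
              if a != gap and b != gap]
    if not paired:
        return 0.0
    matches = sum(1 for a, b in paired if a == b)
    return 100.0 * matches / len(paired)


def qualifies(aligned_candidate: str, aligned_variant: str, original: str,
              threshold: float = 90.0, gap: str = "-") -> bool:
    """True if the candidate has at least `threshold` percent identity to the
    variant and is not 100% identical to the original sequence."""
    if aligned_candidate.replace(gap, "") == original:
        return False
    return percent_identity(aligned_candidate, aligned_variant, gap) >= threshold
```

For instance, two aligned 10-residue sequences that differ at one position give
90% identity, whereas a column holding a gap in either sequence is left out of
both the numerator and the denominator.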

"Isolated nucleic acid molecule having a variant nucleic acid sequence" - is a
nucleic acid molecule that includes the coding variant nucleic acid sequence.
Said isolated nucleic acid molecule may include the variant nucleic acid
sequence as an independent insert; may include the variant nucleic acid sequence
fused to an additional coding sequence, encoding together a fusion protein in
which the variant coding sequence is the dominant coding sequence (for
example, the additional coding sequence may code for a signal peptide); may
include the variant nucleic acid sequence in combination with non-coding sequences,
e.g., introns or control elements, such as promoter and terminator elements or 5'
and/or 3' untranslated regions, effective for expression of the coding sequence in
a suitable host; or may be a vector in which the variant protein coding sequence
is a heterologous sequence.

"Expression vector" - refers to vectors that have the ability to incorporate and 
express heterologous DNA fragments in a foreign cell. Many prokaryotic and 
eukaryotic expression vectors are known and/or commercially available. 
Selection of appropriate expression vectors is within the knowledge of those 
having skill in the art. 

"Deletion" - is a change in either nucleotide or amino acid sequence in which 
one or more nucleotides or amino acid residues, respectively, are absent. 

"Insertion" or "addition" - is that change in a nucleotide or amino acid 
sequence which has resulted in the addition of one or more nucleotides or amino 
acid residues, respectively, as compared to the naturally occurring sequence. 

"Substitution" - replacement of one or more nucleotides or amino acids by 
different nucleotides or amino acids, respectively. As regards amino acid 
sequences the substitution may be conservative or non-conservative.

"Antibody" - refers to IgG, IgM, IgD, IgA, and IgE antibodies. The definition
includes polyclonal antibodies or monoclonal antibodies. This term refers to
whole antibodies or fragments of the antibodies comprising the antigen-binding
domain of the anti-variant product antibodies, e.g. antibodies without the Fc
portion, single chain antibodies, fragments consisting of essentially only the
variable, antigen-binding domain of the antibody, etc.

"Treating a disease" - refers to administering a therapeutic substance effective
to ameliorate symptoms associated with a disease, to lessen the severity or cure
the disease, or to prevent the disease from occurring.

"Detection" - refers to a method of detection of a disease, disorder, pathological 
or normal condition. This term may refer to detection of a predisposition to a 
disease as well as for establishing the prognosis of the patient by determining the 
severity of the disease. 

"Probe" - the variant nucleic acid sequence, or a sequence complementary
therewith, when used to detect presence of other similar sequences in a sample.
The detection is carried out by identification of hybridization complexes between
the probe and the assayed sequence. The probe may be attached to a solid support
or to a detectable label.

"Original sequence" - the amino acid or nucleic acid sequence from which the 
variant of the invention has been varied as a result of alternative splicing.

"Data carrier" - a medium for holding informational data which is in a
computer readable form. It may be a magnetic or non-magnetic data carrier.

SUMMARY OF THE INVENTION 

The present invention is based on the finding of novel, naturally occurring
splice variants, which are naturally occurring sequences obtained by alternative
splicing of known genes. The novel splice variants of the invention are not merely
truncated forms, fragments or mutations of known genes, but rather novel
sequences which naturally occur within the body of individuals.

Each novel splice variant is a result of alternative splicing of an original
sequence. One original sequence may have one or more splice variant sequences
derived therefrom by alternative splicing. The original sequences, and hence the
variants, have been divided into 43 functional groups according to their biological
activity, as will be explained below.

The nucleic acid sequence is present in one of the sequences denoted NV_1
(hereinafter "SEQ ID NO:1") to NV_48611 (hereinafter "SEQ ID NO:48611"),
which are present in an attached CD-ROM marked "New Variants October 2000"
and listed in a group of 43 computer files: the nucleic acid sequences are listed under
(functional group name)_for_patent. The amino acid sequences are listed under
(functional group name)_pep_patent. This CD-ROM forms an integral part of this
disclosure, and will be denoted hereinafter simply as "CD-ROM".

The term "alternative splicing" in the context of the present invention and
claims refers to: intron inclusion, exon exclusion, addition or deletion of terminal
sequences in the variant as compared to the original sequences, as well as to the
possibility of "intron retention". Intron retention is an intermediate stage in the
processing of RNA transcripts, where prior to production of fully processed mRNA
the intron (naturally spliced in the original sequence) is retained in the variant.
These intermediately processed RNAs may have physiological significance and are
also within the scope of the invention.

The novel variant products of the invention may have the same
physiological activity as the original peptide from which they are varied (although
perhaps at a different level); may have an opposite physiological activity from the
activity featured by the original peptide from which they are varied; may have a
completely different, unrelated activity to the activity of the original from which
they are varied; or alternatively may have no activity at all, and this may lead to
various diseases or pathological conditions. Both in the case where the variant has
the same activity and in the case where it has an opposite activity to the original
sequence, it may differ from the original sequence in various properties not
directly connected to its biological activity, such as its stability, its clearance
rate, its tissue and cellular localization, its temporal pattern of expression, the
mechanisms for its up- or down-regulation, its responses to agonists or
antagonists, etc.

The novel variants may serve for detection purposes, i.e. their presence or
level may be indicative of a disease, disorder, pathological or normal condition, or
alternatively the ratio between the level of the variants and the level of the original
peptide from which they were varied, or the ratio to other variants (derived from the
same original sequence), may be indicative of a disease, disorder, pathological or
normal condition.

For example, for detection purposes, it is possible to establish differential
expression of various variants in various tissues. A certain variant may be
expressed mainly in one tissue, while the original sequence from which it has been
varied, or another variant derived from the same sequence, may be expressed
mainly in another tissue. Understanding of the distribution of the variants in various
tissues may be helpful in basic research, for understanding the physiological
function of the genes, and may also help in targeting pharmaceuticals or
developing pharmaceuticals.

The study of the variants may also be helpful to distinguish various stages in
the life cycles of cells, which may also be helpful for development of
pharmaceuticals for various pathological conditions in which the cell cycle is
abnormal, for example cancer.

Thus the detection may be by determination of the presence or the level of
expression of the variant within a specific cell population, comprising determining
said presence or level and comparing it between various cell types in a tissue,
between different tissues and between individuals.

Thus the present invention provides, by its first aspect, a novel isolated
nucleic acid molecule comprising or consisting of any one of the coding sequences
SEQ ID NO:1 to SEQ ID NO:48611, fragments of said coding sequence having at
least 20 nucleic acids (provided that said fragments are continuous stretches of
nucleotides not present in the original sequence from which the variant was varied),
or a molecule comprising a sequence having at least 90% identity to SEQ ID
NO:1 to SEQ ID NO:48611, provided that the molecule is not completely identical
to the original sequence from which the variant was varied.



The present invention further provides a protein or polypeptide comprising
or consisting of an amino acid sequence encoded by any of the above nucleic acid
sequences, termed herein "variant product", fragments of the above amino acid
sequence having a length of at least 10 amino acids coded by the above fragments
of the nucleic acid sequences, as well as homologues of the above amino acid
sequences in which one or more of the amino acid residues has been substituted (by
conservative or non-conservative substitution), added, deleted, or chemically
modified. More specifically, the amino acid sequences are those present in the
attached CD-ROM, wherein each amino acid sequence has the same NV_...
number as the nucleic acid sequence which codes for it.

The deletions, insertions and modifications should be in regions, or adjacent 
to regions, wherein the variant differs from the original sequence. 

For example, where the variant is different from the original sequence by
addition of a short stretch of 10 amino acids, in the terminal or non-terminal
portion of the peptide, the invention also concerns homologues of that variant
where the additional short stretch is altered; for example, it includes only 8
additional amino acids, includes 13 additional amino acids, or includes 10
additional amino acids, some of which are conservative or
non-conservative substitutes of the original additional 10 amino acids of the novel
variant. In all cases the changes in the homolog, as compared to the original
sequence, are in the same regions where the variant differs from the original
sequence, or in regions adjacent to said regions.

Another example is where the variant lacks a non-terminal region (for 
example of 20 amino acids) which is present in the original sequence (due for 
example to exon exclusion). The homologues may lack in the same region only 17 
amino acids or 23 amino acids. Again the deletion is in the same region where the 
variant lacks a sequence as compared to the original sequence, or in a region 
adjacent thereto. 

It should be appreciated that once the attention of a man versed in the art is
directed to the importance of a specific region, due to the fact that this region differs
in the variant as compared to the original sequence, there is no difficulty in
modifying said specific region by adding to it, deleting from it, or substituting
some amino acids in it. Thus homologues of variants which are derived from the
variant by changes (deletion, addition, substitution) only in said region, as well as in
regions adjacent to it, are also a part of the present invention. Generally, if the
variant is distinguished from the original sequence by some sort of physiological
activity, then the homolog is distinguished from the original sequence in essentially
the same manner.

The present invention further provides a nucleic acid molecule comprising or
consisting of a sequence which encodes the above amino acid sequences (including
the fragments and homologues of the amino acid sequences). Due to the
degenerate nature of the genetic code, a plurality of alternative nucleic acid
sequences, beyond those depicted in any one of SEQ ID NO:1 to SEQ
ID NO:48611, can code for the amino acid sequence of the invention. Those
alternative nucleic acid sequences which code for the same amino acid sequences
coded by the sequences SEQ ID NO:1 to SEQ ID NO:48611 are also an aspect of
the present invention.
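
To give a sense of how large this plurality can be, the following minimal sketch
counts the nucleic acid sequences that encode a given peptide under the standard
genetic code. One-letter amino acid codes, the table name and the function name
are assumptions made for illustration; stop codons are not counted.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total; the three stop codons are excluded).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding `peptide`."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
```

Even a fragment of 10 amino acids, the minimum fragment length used herein, can
therefore be encoded by many distinct nucleic acid sequences unless it consists
only of Met and Trp.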

The present invention further provides expression vectors and cloning
vectors comprising any of the above nucleic acid sequences, as well as host cells
transfected by said vectors.

The present invention still further provides pharmaceutical compositions 
comprising, as an active ingredient, said nucleic acid molecules, said expression 
vectors, or said protein or polypeptide. 

These pharmaceutical compositions are suitable for the treatment of diseases
and pathological conditions, which can be ameliorated, cured or prevented by
raising the level of any one of the variant products of the invention.

By a second aspect, the present invention provides a nucleic acid molecule
comprising or consisting of a non-coding sequence which is complementary to that
of any one of SEQ ID NO:1 to SEQ ID NO:48611, or complementary to a sequence
having at least 90% identity to said sequence (with the proviso added above) or a
fragment of said two sequences (according to the above definition of fragment).
The complementary sequence may be a DNA sequence which hybridizes with any
one of SEQ ID NO:1 to SEQ ID NO:48611 or hybridizes to a portion of that
sequence having a length sufficient to inhibit the transcription of the
complementary sequence. The complementary sequence may be a DNA sequence
which can be transcribed into an mRNA being an antisense to the mRNA
transcribed from any one of SEQ ID NO:1 to SEQ ID NO:48611 or into an mRNA
which is an antisense to a fragment of the mRNA transcribed from any one of SEQ
ID NO:1 to SEQ ID NO:48611 which has a length sufficient to hybridize with the
mRNA transcribed from SEQ ID NO:1 to SEQ ID NO:48611, so as to inhibit its
translation. The complementary sequence may also be the mRNA or the fragment
of the mRNA itself.
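
The relationship between a coding sequence and its complementary (antisense)
sequence can be sketched as below. The sketch assumes plain A/C/G/T strings
written 5' to 3'; the function names are illustrative, and ambiguity codes and
hybridization conditions are not modelled.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """DNA strand complementary to `dna`, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(coding_dna: str) -> str:
    """RNA that is antisense to the mRNA transcribed from `coding_dna`."""
    return reverse_complement(coding_dna).replace("T", "U").replace("t", "u")
```

Applying reverse_complement twice returns the input sequence, which is a quick
sanity check that the complementary strand was derived consistently.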

The nucleic acids of the second aspect of the invention may be used for
therapeutic or diagnostic applications, for example as probes used for the detection
of the variants of the invention.

The presence of the variant transcript or the level of the variant transcript
(identified either by any one of sequences 1 to 48611 or by a sequence
complementary thereto) may be indicative of a multitude of diseases, disorders and
various pathological as well as normal conditions. In addition, the level
of the transcripts of the variants of the invention may also be compared to that of
the transcripts of the original sequences from which they were varied, or to the level
of transcript of other variants, and the resulting ratio may be indicative of a
multitude of diseases, disorders and various pathological and normal conditions.

The present invention also provides expression vectors comprising any one
of the above defined complementary nucleic acid sequences and host cells
transfected with said nucleic acid sequences or vectors, being complementary to
those specified in the first aspect of the invention.

The invention also provides anti-variant product antibodies, namely
antibodies directed against the variant product which specifically bind to said
variant product. Said antibodies are useful both for diagnostic and therapeutic
purposes. For example, said antibodies may be used as an active ingredient in a
pharmaceutical composition, as will be explained below.

By another alternative, the invention concerns antibodies termed
"distinguishing antibodies" which are directed solely to the amino acid sequences
which distinguish the variant from the original amino acid sequence from which
it has been varied by alternative splicing. For example, where the variant contains
15 additional amino acids as compared to the original sequence (due to intron 
inclusion) the antibodies may be directed against these additional amino acids 
(present in the variant and not present in the original sequence). Another example is 
where the variant lacks 20 amino acids as compared to the original sequence from 
which it is varied (for example due to exon exclusion). The distinguishing 
antibodies in that case may be directed only against these 20 amino acids which are 
present in the original sequence and absent from the variant sequence. 

The antibodies and the distinguishing antibodies may be used for detection
purposes, i.e. to detect individuals, tissues or conditions (whether pathological or
physiological) wherein the variant sequence or original sequence is evident or
abundant. The antibodies may also be used to distinguish conditions where the
level of the variant, or the ratio of the variant to the original sequence, is altered.

The antibodies and the distinguishing antibodies may also be used for
therapeutic purposes, i.e., to neutralize only the variant product or only the
product of the original sequence, as the case may be, without neutralizing the other.

The present invention also provides pharmaceutical compositions
comprising, as an active ingredient, the nucleic acid molecules which comprise or
consist of said complementary sequences, or a vector comprising said
complementary sequences. The present invention thus also provides
pharmaceutical compositions comprising, as an active ingredient, said anti-variant
product antibodies.

The pharmaceutical compositions comprising said anti-variant product
antibodies or the nucleic acid molecule comprising said complementary sequence
are suitable for the treatment of diseases and pathological conditions where a
therapeutically beneficial effect may be achieved by neutralizing the variant (either
at the transcript or product level), decreasing the amount of the variant product or
blocking its binding to its target, for example, by the neutralizing effect of the
antibodies, or by the effect of the antisense mRNA in decreasing the
expression level of the variant product.

The variant products of the invention may also be used for screening of
pharmaceuticals which interact only with the variant and not with the original
sequence, or vice versa, thereby choosing or tailoring pharmaceuticals having better
specificity to tissues, to specific conditions or to proteins
expressed by a specific individual.

According to the third aspect of the invention, the present invention provides
methods for detecting the level of the transcript (mRNA) of said variant product in
a body fluid sample, or in a specific tissue sample, for example by use of probes
comprising or consisting of said coding sequences; as well as methods for detecting
levels of expression of said product in tissue, e.g. by the use of antibodies capable
of specifically reacting with the variant products of the invention. Detection of the
level of the expression of the variant of the invention, in particular as compared to
that of the original sequence from which it was varied or as compared to other variant
sequences all varied from the same original sequence, may be indicative of a
plurality of physiological or pathological conditions.

The method, according to this latter aspect, for detection of a nucleic acid
sequence which encodes the variant product in a biological sample, comprises the
steps of:

(a) providing a probe comprising at least one of the nucleic acid
sequences defined above;

(b) contacting the biological sample with said probe under conditions
allowing hybridization of nucleic acid sequences thereby enabling formation of
hybridization complexes;

(c) detecting hybridization complexes, wherein the presence of the
complex indicates the presence of nucleic acid sequence encoding the variant
product in the biological sample.

The method as described above is qualitative, i.e. indicates whether the
transcript is present in or absent from the sample. The method can also be
quantitative, by determining the level of hybridization complexes and then
calibrating said levels to determine levels of transcripts of the desired variant in
the sample.
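
One simple way to picture the calibration step is a linear standard curve, sketched
below. The linear relationship between hybridization signal and transcript amount,
the function names and the numerical values are all assumptions made for
illustration; no particular calibration method is prescribed herein.

```python
def fit_line(amounts: list[float], signals: list[float]) -> tuple[float, float]:
    """Least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("standards must cover at least two distinct amounts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_amount(signal: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the transcript amount."""
    return (signal - intercept) / slope


# Hypothetical standards: known transcript amounts and measured signals.
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_signals = [5.0, 15.0, 25.0, 45.0]
slope, intercept = fit_line(standard_amounts, standard_signals)
sample_amount = estimate_amount(35.0, slope, intercept)
```

The same estimate can be made for the original sequence or for another variant,
and the two amounts compared as a ratio.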

Both qualitative and quantitative determination methods can be used for 
diagnostic, prognostic and therapy planning purposes. 

By a preferred embodiment the probe is part of a nucleic acid chip used for 
detection purposes, i.e. the probe is a part of an array of probes each present in a 
known location on a solid support. 

The nucleic acid sequence used in the above method may be a DNA 
sequence, an RNA sequence, etc.; it may be a coding sequence or a sequence 
complementary thereto (for respective detection of RNA transcripts or 
coding-DNA sequences). By quantifying the level of hybridization complexes 
and calibrating the quantified results it is also possible to detect the level of the 
transcript in the sample. If desired, the detected level may be compared to that of 
the original sequence or to that of other splice variants, for example, 
those obtained from the same original sequence by alternative splicing. 

Methods for detecting mutations in the region coding for the variant product 
are also provided. These may be carried out in a binary fashion, namely 
merely detecting whether there are any mismatches between the normal variant 
nucleic acid sequence of the invention and the one present in the sample, or 
carried out by specifically detecting the nature and location of the mutation. 
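
The two modes of mutation detection can be sketched in software terms as follows. This is only an illustration on sequence strings: the example sequences are hypothetical, and only substitutions between equal-length sequences are handled (insertions and deletions would require an alignment step).

```python
def has_mismatch(reference, sample):
    """Binary mode: True if the sample differs from the reference at all."""
    return reference != sample


def list_substitutions(reference, sample):
    """Specific mode: (position, reference_base, sample_base) per substitution.

    Positions are 1-based. Only equal-length sequences are handled.
    """
    if len(reference) != len(sample):
        raise ValueError("sketch handles equal-length sequences only")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Hypothetical sequences, for illustration only.
reference = "ATGGCCTTAGC"
sample = "ATGGCGTTAGA"
mismatch_found = has_mismatch(reference, sample)
substitutions = list_substitutions(reference, sample)
```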

The present invention also concerns a method for detecting variant product 
in a biological sample, comprising the steps of: 

(a) contacting with said biological sample the antibody of the invention, 
thereby forming an antibody-antigen complex; and 

(b) detecting said antibody-antigen complex, 

wherein the presence of said antibody-antigen complex correlates with the 
presence of variant product in said biological sample. 

As indicated above, the method can be made quantitative to determine the level or 
the amount of the variant in the sample, alone or in comparison to the level of the 
original amino acid sequence from which it was varied, and qualitative and 
quantitative results may be used for diagnostic, prognostic and therapy planning 
purposes. 

By yet another aspect the invention also provides a method for identifying 
candidate compounds capable of binding to the variant product and modulating its 
activity (being either activators or deactivators). The method includes: 

(i) providing a protein or polypeptide comprising an amino acid 
sequence substantially as coded by any one of SEQ ID NO:1 to 48611, or a 
fragment of such a sequence; 

(ii) contacting a candidate compound with said amino acid sequence; and 

(iii) measuring the physiological effect of said candidate compound on 
the activity of the amino acid sequence and selecting those compounds which 
show a significant effect on said physiological activity. 
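
The selection in step (iii) can be sketched as a simple thresholding of measured activities. The specification does not define what counts as a "significant effect"; the 20% relative-change threshold, the function name and the assay readings below are placeholders chosen for illustration.

```python
def classify_compounds(baseline, measurements, threshold=0.2):
    """Label each compound by its relative effect on activity.

    `baseline` is the activity measured without any compound and
    `measurements` maps a compound name to the activity measured in its
    presence. The threshold is an arbitrary stand-in for "significant".
    Compounds below the threshold in either direction are not returned.
    """
    result = {}
    for name, activity in measurements.items():
        change = (activity - baseline) / baseline
        if change >= threshold:
            result[name] = "activator"
        elif change <= -threshold:
            result[name] = "deactivator"
    return result


# Hypothetical assay readings in arbitrary units.
hits = classify_compounds(
    100.0, {"cmpd_A": 150.0, "cmpd_B": 95.0, "cmpd_C": 40.0}
)
```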

The present invention also concerns compounds identified by the 
methods described above, which compounds may be either activators of the 
variant product or deactivators thereof. 

As indicated above, the novel variants of the invention fall under 43 
functional groups. 

These groups have been defined by the activity of the original sequences 
from which the variants have been varied. The name of the group, its function, the 
number of the original sequences (genes) falling under that group, the number of 
splice variants falling under that group and the SEQ ID NOS. of the variants are 
given in Table 1 below. 

FUNCTIONAL GROUP NAME | Total of original sequences | Total of new variants | SEQ ID NOS. of new variants | Description of the proteins
ADAPTOR_BINDING | 442 | 5525 | 1-5525 | Proteins that are associated to other cell components, either by binding, interacting, or associating to them. This interaction is necessary for the protein's activity and/or structure.
ADHESION | 72 | 1054 | 5526-6579 | Proteins that serve as adhesion molecules between adjoining cells
APOLIPOPROTEINS | 9 | 202 | 6580-6781 | Proteins that are part of the lipoprotein particle and act as a recognition signal for the cellular binding and internalization of these particles.
APOPTOSIS | 43 | 645 | 6782-7426 | Proteins and enzymes that are involved in the apoptosis pathway, either by inducing or inhibiting it.
CANCER | 224 | 2659 | 7427-10085 | Proteins that are involved in cancer; oncogenes, DNA repair proteins, tumor markers and antigens, tumor suppressors, and cellular second messengers that participate in cancer.
CARBOXYLASE | 17 | 301 | 10086-10386 | Enzymes that add or remove CO2 groups
CD | 38 | 376 | 10387-10762 | Cell surface antigens
CELL_CYCLE | 64 | 677 | 10763-11439 | Proteins and enzymes involved in controlling the cell cycle pathway, cellular growth, cell division, and cellular progression.
COAGULATION | 8 | 24 | 11440-11463 | Proteins involved in the blood coagulation pathway
CONVERTING_ | 7 | 109 | 11464-11572 | Enzymes that convert one protein to another by specific cleavage of the precursor protein.
CYCLASE | 8 | 27 | 11573-11599 | Enzymes that convert triphosphate to cyclic monophosphate
DEGRADATION | 69 | 906 | 11600-12505 | Proteins and cellular enzymes involved in the degradation process of other proteins
DEVELOPMENTAL | 21 | 143 | 12506-12648 | Proteins affecting development
DISEASE_RELATED | 79 | 856 | 12649-13504 | Proteins involved in a certain disease(s), either by contributing to it, or by acting as a marker for it
DOMAIN | 81 | 655 | 13505-14159 | Proteins involved in protein-protein interactions
ESTERASE | 30 | 209 | 14160-14368 | Enzymes cleaving the ester bond between a chemical residue and a protein.
GROWTH_FACTORS | 58 | 630 | 14369-14998 | Growth factors, cytokines, interleukins, interferons, and lymphokines
HORMONES | 51 | 492 | 14999-15490 | Hormones, poietin proteins
HOUSEKEEPING | 49 | 405 | 15491-15895 | Homeobox, heat shock proteins and factors, chaperonin
HYDRO | 99 | 1215 | 15896-17110 | Enzymes that modify the hydroxyl group, such as hydrogenase, dehydrogenase, hydrolase, and hydroxylase
IMMUNO | 113 | 1529 | 17111-18639 | Proteins that are involved in the immune and complement systems, such as antigens and autoantigens, immunoglobulins, MHC and HLA proteins and their associated proteins
INHIBITORS | 87 | 1127 | 18640-19766 | Inhibitors and suppressors of other proteins and enzymes.
KINASE | 275 | 3077 | 19767-22843 | Kinase
LIPASE | 23 | 238 | 22844-23081 | Lipase, phospholipase, and lysophospholipase
MATRIX | 351 | 4224 | 23082-27305 | All proteins comprising the cell matrix and cytoskeleton
MODIFYING_ENZYMES | 207 | 2103 | 27306-29408 | Miscellaneous enzymes such as paraoxonase, GTPase, ATPase, anhydrase.
MUTASE | 7 | 77 | 29409-29485 | Mutases and superoxide dismutase.
NEURO | 61 | 429 | 29486-29914 | CNS related proteins and enzymes
OXIDASE | 44 | 520 | 29915-30434 | Oxidase and peroxidase
OXYGENASE | 12 | 141 | 30435-30575 | Oxygenase, mono- and di-oxygenase
PHOSPHATASE | 88 | 884 | 30576-31459 | Phosphatases and phosphorylases
PHOSPHOPROTEINS | 22 | 294 | 31460-31753 | Phosphoproteins and phospholipids
PROTEASE | 113 | 1392 | 31754-33145 | Proteases, peptidases, and proteinases.
RECEPTORS | 205 | 1684 | 33146-34829 | Receptors
REDUCTASE | 60 | 721 | 34830-35550 | Reductases
SECRETED FACTORS | 23 | 110 | 35551-35660 | Secreted proteins
SIGNAL_TRANSDUCTION | 51 | 490 | 35661-36150 | Proteins that participate in signal transduction, such as G proteins
SUBCELLULAR | 53 | 975 | 36151-37125 | Subcellular proteins such as ribosomal proteins
SYNTASE | 88 | 1255 | 37126-38380 | Synthases, synthetases
TRANSCRIPTIONAL RNA DNA | 502 | 6750 | 38381-45130 | Nuclear proteins involved in RNA and DNA, such as transcription factors, RNA and DNA binding proteins, zinc fingers, helicase, isomerase, histones, nucleases
TRANSFER | 142 | 1423 | 45131-46553 | Proteins involved in transfer of functional groups
TRANSLATIONAL FACTORS | 30 | 476 | 46554-47029 | Proteins and enzymes involved in the translational process such as elongation and initiation factors
TRANSPORTER | 171 | 1582 | 47030-48611 | Proteins that mediate the transport of molecules and macromolecules, such as channels, exchangers, pumps.
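
As a reading aid, the SEQ ID NO ranges of Table 1 can be checked for consistency and used to look up the functional group of a given sequence. The sketch below transcribes the group names, variant counts and ranges from the table; the function names are hypothetical.

```python
from bisect import bisect_right

# (group name, number of new variants, first SEQ ID NO, last SEQ ID NO),
# transcribed from Table 1.
TABLE_1 = [
    ("ADAPTOR_BINDING", 5525, 1, 5525),
    ("ADHESION", 1054, 5526, 6579),
    ("APOLIPOPROTEINS", 202, 6580, 6781),
    ("APOPTOSIS", 645, 6782, 7426),
    ("CANCER", 2659, 7427, 10085),
    ("CARBOXYLASE", 301, 10086, 10386),
    ("CD", 376, 10387, 10762),
    ("CELL_CYCLE", 677, 10763, 11439),
    ("COAGULATION", 24, 11440, 11463),
    ("CONVERTING_", 109, 11464, 11572),
    ("CYCLASE", 27, 11573, 11599),
    ("DEGRADATION", 906, 11600, 12505),
    ("DEVELOPMENTAL", 143, 12506, 12648),
    ("DISEASE_RELATED", 856, 12649, 13504),
    ("DOMAIN", 655, 13505, 14159),
    ("ESTERASE", 209, 14160, 14368),
    ("GROWTH_FACTORS", 630, 14369, 14998),
    ("HORMONES", 492, 14999, 15490),
    ("HOUSEKEEPING", 405, 15491, 15895),
    ("HYDRO", 1215, 15896, 17110),
    ("IMMUNO", 1529, 17111, 18639),
    ("INHIBITORS", 1127, 18640, 19766),
    ("KINASE", 3077, 19767, 22843),
    ("LIPASE", 238, 22844, 23081),
    ("MATRIX", 4224, 23082, 27305),
    ("MODIFYING_ENZYMES", 2103, 27306, 29408),
    ("MUTASE", 77, 29409, 29485),
    ("NEURO", 429, 29486, 29914),
    ("OXIDASE", 520, 29915, 30434),
    ("OXYGENASE", 141, 30435, 30575),
    ("PHOSPHATASE", 884, 30576, 31459),
    ("PHOSPHOPROTEINS", 294, 31460, 31753),
    ("PROTEASE", 1392, 31754, 33145),
    ("RECEPTORS", 1684, 33146, 34829),
    ("REDUCTASE", 721, 34830, 35550),
    ("SECRETED FACTORS", 110, 35551, 35660),
    ("SIGNAL_TRANSDUCTION", 490, 35661, 36150),
    ("SUBCELLULAR", 975, 36151, 37125),
    ("SYNTASE", 1255, 37126, 38380),
    ("TRANSCRIPTIONAL RNA DNA", 6750, 38381, 45130),
    ("TRANSFER", 1423, 45131, 46553),
    ("TRANSLATIONAL FACTORS", 476, 46554, 47029),
    ("TRANSPORTER", 1582, 47030, 48611),
]


def check_table(rows):
    """Verify contiguous ranges that match the stated variant counts.

    Returns the last SEQ ID NO covered by the table.
    """
    expected_first = 1
    for name, count, first, last in rows:
        if first != expected_first:
            raise ValueError(f"{name}: range does not start at {expected_first}")
        if last - first + 1 != count:
            raise ValueError(f"{name}: count {count} does not match its range")
        expected_first = last + 1
    return expected_first - 1


_STARTS = [row[2] for row in TABLE_1]


def group_of(seq_id):
    """Return the functional group name for a SEQ ID NO."""
    if not 1 <= seq_id <= TABLE_1[-1][3]:
        raise ValueError(f"SEQ ID NO {seq_id} is outside Table 1")
    return TABLE_1[bisect_right(_STARTS, seq_id) - 1][0]
```

For instance, `group_of(29486)` resolves to the NEURO group, whose range starts immediately after the MUTASE range ends at 29485.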



The pharmaceutical compositions, whether comprising the nucleic acid 
sequences of the variants of the invention themselves (alone or in an expression 
vector), comprising complementary sequences thereto (alone or in an expression 
vector), comprising the amino acid sequences (products), or alternatively, comprising 
antibodies to the above, are suitable for the treatment of a plurality of diseases, each 
one in accordance with the activity of the functional group to which the new variant 
belongs. 

The detection of diseases utilizing a variant probe (comprising the variant 
sequence or a sequence complementary thereto) or alternatively comprising an 
amino acid sequence reactive with the variant product is also in accordance with the 
functional group to which the variants belong. 

Thus, in the following, there shall be a brief summary of those conditions 
and diseases which the pharmaceutical composition can treat, i.e. cure, 
ameliorate or prevent, as well as those conditions which can be detected by variant 
probes of the present invention, or by antibodies reactive with the variant product of 
the invention. 



Group 1 - Adaptor-binding - (SEQ ID NO: 1-5525), the pharmaceutical 
compositions (comprising all aspects as indicated above) and the probes/antibodies 
may treat or detect, respectively, pathological conditions which are associated with 
non-normal protein activity or structure. Binding of the products of the variants of 
this family, or antibodies reactive therewith, can modulate a plurality of protein 
activities as well as change protein structure. 

Group 2 - Adhesion - (SEQ ID NO:5526-6579), the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, conditions in which adhesion 
between adjoining cells is involved, typically conditions in which the adhesion is 
non-normal. Typical examples of such conditions are cancer conditions in which 
non-normal adhesion may cause and enhance the process of metastasis. Other 
examples of such conditions include conditions of non-normal growth and 
development of various tissues in which modulation of adhesion among adjoining 
cells can improve the condition. 

Group 3 - Apolipoproteins - (SEQ ID NO:6580-6781), the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which involve non-normal lipoprotein particle signaling, cellular binding 
and internalization, such as diseases which involve abnormally high or low levels 
of lipoprotein and cholesterol, as well as conditions involved in the formation of 
atherosclerosis. 

Group 4 - Apoptosis - (SEQ ID NO:6782-7426), the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence and 
antibody may serve to treat, or detect, respectively, diseases which are involved in 
premature death of cells, such as degenerative diseases, for example 
neurodegenerative diseases or conditions associated with aging, or alternatively, 
diseases wherein apoptosis which should have taken place does not take place. 
Examples of such diseases are cancerous diseases. 

Group 5 - Cancer diseases - (SEQ ID NO:7427-10085), the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibodies may serve to treat, or detect, respectively, 
cancer diseases as well as metastasis, or to prevent cancer diseases. The detection may 
also be for pre-disposition to a disease, as well as for determination of the stage of 
the disease. 

Group 6 - Carboxylases, (SEQ ID NO: 10086-10386), the pharmaceutical 
compositions (including the variant sequence, the product, the sequence 
complementary to the variant sequence or an antibody to the product), and a probe 
variant sequence may serve to treat, or detect, respectively, those diseases which 
can be ameliorated or improved by regulation of enzymatic reactions which remove 
CO2 groups from other moieties, notably by removal of CO2 from protein. 

Group 7 - CD's, (SEQ ID NO: 10387-10762), the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases in which regulation of 
the recognition, participation or binding of cell surface antigens to other moieties 
may improve the disease. These diseases include autoimmune diseases, various 
infectious diseases, cancer diseases which involve non-normal cell surface antigen 
recognition and activity, etc. 

Group 8 - Cell cycle, (SEQ ID NO: 10763-11439), the pharmaceutical compositions 
(including the variant sequence, the product, the sequence complementary to the 
variant sequence or an antibody to the product), and a probe variant sequence may 
serve to treat, or detect, respectively, diseases which are manifested by, or involve, 
non-normal cell cycle pathways, non-normal cellular growth, division and 
progression. Typically these diseases are manifested either as degenerative diseases 
(low growth), or on the other hand as cancerous diseases (uncontrolled growth). 

Group 9 - Coagulation, (SEQ ID NO: 11440-11463), the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which are manifested by non-normal coagulation processes, which may 
include abnormal bleeding or excessive coagulation. 

Group 10 - Converting enzymes, (SEQ ID NO: 11464-11572), the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which are manifested by non-normal conversion of one protein to another 
due to lack of, or excessive, cleavage of the specific precursor protein. 

Group 11 - Cyclase, (SEQ ID NO: 11573-11599), the pharmaceutical compositions 
(including the variant sequence, the product, the sequence complementary to the 
variant sequence or an antibody to the product), and a probe variant sequence may 
serve to treat, or detect, respectively, diseases which are manifested by non-normal 
(excessive, or lack of) conversion of triphosphate to cyclic monophosphate, as 
well as diseases in which internal-cell signaling, caused by the above conversion, is 
non-normal. 

Group 12 - Degradation, (SEQ ID NO: 11600-12505) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases in which there is abnormal degradation of other proteins, which may cause 
non-normal accumulation of various proteinaceous products in cells, or cause 
non-normal (prolonged or shortened) activity of proteins, etc. 

Group 13 - Developmental, (SEQ ID NO: 12506-12648), the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which are manifested by non-normal development, which may be 
non-normal development of the organism (genetic diseases involving non-normal 
development of a fetus), non-normal development of a tissue (a tissue which is not 
properly developed) as well as cancer diseases. 

Group 14 - Disease-related, (SEQ ID NO: 12649-13504), the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, a 
variety of different diseases. 

Group 15 - Domain proteins, (SEQ ID NO: 13505-14159), the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases in which the protein-to-protein interactions are non-normal, due to 
excessive interaction, insufficient interaction or lack of proper interaction between 
proteins. 

Group 16 - Esterase, (SEQ ID NO: 14160-14368), the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases and pathological 
conditions which may be ameliorated by modulating the rate, or the activity, of 
cleavage of ester bonds between any chemical residue and a protein. 

Group 17 - Growth factors, (SEQ ID NO: 14369-14998), the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which may be ameliorated by modulating the concentration, activity, 
interaction, binding, etc. of growth factors, cytokines, interleukins, interferons and 
lymphokines, typically diseases such as autoimmune diseases, inflammation-related 
diseases, Graft vs. Host diseases, diseases caused by infectious agents, cancer 
diseases, as well as diseases originating from improper concentration of growth 
factors causing non-normal (either excessive or too little) growth of various 
tissues themselves, or causing untimely death of a desired cell population. 

Group 18 - Hormones, (SEQ ID NO: 14999-15490) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which are endocrine in essence (cause or are a result of hormones), or may 
be ameliorated by raising or decreasing the level of hormones and poietin protein. 

Group 19 - Housekeeping, (SEQ ID NO: 15491-15895) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
conditions in which a beneficial effect is evident by regulating the expression of a 
protein necessary for survival. 

Group 20 - Hydro, (SEQ ID NO: 15896-17110) the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases in which the activity 
connected with hydroxyl groups such as hydrogenation, dehydrogenation, 
hydrolysis, and hydroxylation activity is non-normal (increased or decreased). 

Group 21 - Immuno, (SEQ ID NO: 17111-18639) the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases involving the 
immunological system including inflammation, autoimmune diseases, infectious 
diseases, as well as cancerous processes. 

Group 22 - Inhibitors, (SEQ ID NO: 18640-19766) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases in which a beneficial effect may be achieved by modulating the activity of 
inhibitors and suppressors of proteins and enzymes. 

Group 23 - Kinase, (SEQ ID NO: 19767-22843), the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases which may be 
ameliorated by modulating kinase activity, which is one of the main signaling 
pathways inside cells. 

Group 24 - Lipase, (SEQ ID NO:22844-23081), the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence 
may serve to treat, or detect, respectively, diseases which involve non-normal 
metabolism activity or interactions of lipases. 

Group 25 - Matrix, (SEQ ID NO:23082-27305), the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases which are caused by or 
due to abnormalities in the cytoskeleton, including cancerous cells, and diseased cells 
including those which do not propagate, grow or function normally. 

Group 26 - Modifying enzymes, (SEQ ID NO:27306-29408) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which can be ameliorated by modulating the activity of various enzymes 
such as GTPases, ATPases, anhydrases and paraoxonases, and various enzymes 
which are involved both in enzymatic processes inside cells and in cell 
signaling. 

Group 27 - Mutase, (SEQ ID NO: 29409-29485) the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases involving mutases and 
superoxide dismutases, including cancer diseases, and various other pathological 
processes connected with aging. 

Group 28 - Neuro, (SEQ ID NO: 29486-29914) the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases involving the central 
nervous system, including diseases involved in various types of dementia, 
neurodegenerative diseases, etc., diseases involving epilepsy, various psychiatric 
disorders, etc., and cancer of neural origin. 

Group 29 - Oxidase, (SEQ ID NO: 29915-30434) the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases caused by non-normal 
activity of improved oxidases and peroxidases. 

Group 30 - Oxygenase, (SEQ ID NO:30435-30575) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases involving non-normal activity of oxygenases, mono- and di-oxygenases. 

Group 31 - Phosphatase, (SEQ ID NO:30576-31459) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which can be ameliorated or cured by modulating the activity of 
phosphatases and phosphorylases. 

Group 32 - Phosphoproteins, (SEQ ID NO:31460-31753) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which involve phosphoproteins and phospholipids, i.e. diseases which are 
caused by an excess of, lack of or non-normal phosphoproteins or phospholipids. 

Group 33 - Protease, (SEQ ID NO:31754-33145) the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases which can be 
ameliorated by modulating the activity of proteases, peptidases and proteinases. 

Group 34 - Receptors, (SEQ ID NO:33146-34829) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases involving various receptors present on various membranes in different 
tissues of the body, including receptors to neurotransmitters, hormones and various 
other effectors and ligands. 

Group 35 - Reductase, (SEQ ID NO: 34830-35550) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases involving reductase enzymes. 

Group 36 - Secreted-factors, (SEQ ID NO:35551-35660) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases which involve non-normal secretion of proteins which may be due to 
non-normal presence, absence or non-normal response to normal levels of secreted 
proteins including hormones, neurotransmitters, and various other proteins secreted 
by cells to the extracellular environment. 

Group 37 - Signal-transduction, (SEQ ID NO:35661-36150) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases in which the signal-transduction, typically involving G-proteins, is 
non-normal, either as a cause, or as a result of the disease. 

Group 38 - Sub-cellular, (SEQ ID NO: 36151-37125) the pharmaceutical 
compositions (including the variant sequence, the product, the sequence 
complementary to the variant sequence or an antibody to the product), and a probe 
variant sequence may serve to treat, or detect, respectively, diseases involving 
non-normal sub-cellular proteins such as non-normal ribosomal protein. 

Group 39 - Synthase, (SEQ ID NO: 37126-38380) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases in which the synthase or synthetase activity should be modulated. 

Group 40 - Transcriptional RNA-DNA, (SEQ ID NO:38381-45130) the 
pharmaceutical compositions (including the variant sequence, the product, a 
sequence complementary to the variant sequence or an antibody to the product), 
and the probe variant sequence or antibody may serve to treat, or detect, 
respectively, diseases involving transcription factors such as: helicases, isomerases, 
histones and nucleases, for example diseases where there is non-normal replication 
or transcription of DNA and RNA respectively. 

Group 41 - Transfer, (SEQ ID NO:45131-46553) the pharmaceutical compositions 
(including the variant sequence, the product, a sequence complementary to the 
variant sequence or an antibody to the product), and the probe variant sequence or 
antibody may serve to treat, or detect, respectively, diseases in which the transfer of 
functional group to a modulated moiety is not normal so that a beneficial effect may 
be achieved by modulation of such transfer. 

Group 42 - Translational-factors, (SEQ ID NO:46554-47029) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases in which the translation, elongation and initiation is non-normal leading to 
various pathological conditions. 

Group 43 - Transporters, (SEQ ID NO:47030-48611) the pharmaceutical 
compositions (including the variant sequence, the product, a sequence 
complementary to the variant sequence or an antibody to the product), and the 
probe variant sequence or antibody may serve to treat, or detect, respectively, 
diseases in which the transport of molecules and macromolecules such as 
neurotransmitters, hormones, sugar etc. is non-normal leading to various 
pathologies. 

The present invention further concerns any one of SEQ ID NO:1 to SEQ ID 
NO:48611 present on a data carrier. The invention further concerns the amino acid 
sequences present on a data carrier. 

The present invention further concerns such a data carrier for use in an 
analysis of a nucleic acid sequence or amino acid sequence. For the purpose of the 
analysis said nucleic acid sequence is compared to a plurality of 
nucleic acid sequences, being substantially SEQ ID NO:1 to SEQ ID NO:48611, 
which are present on a data carrier, or alternatively to the plurality of amino acid 
sequences present on the carrier. Thus, the data carrier of the invention may be used 
by others for analysis of nucleic acid sequences which they have, in order to 
determine whether the sequence they have is a sequence of a splice variant of a 
known gene, obtained through alternative splicing. 

This may be done by using a software data combination comprising a 
nucleotide search and comparison software and a data carrier comprising all of the 
variant sequences of the invention. When the combination is loaded into the
computer it can execute a search where a nucleotide sequence entered by the user is 
compared to the plurality of sequences comprising said data. 

The software used for search and comparison between nucleic acid
sequences, in combination with the data of the invention, may be any software
known in the art for finding homology at a specified level between an entered
nucleic acid sequence and a plurality of nucleic acid sequences present on a
database. Any person wishing to determine whether a nucleic acid sequence he has
is a splice variant of one of the original sequences may do so by determining
whether it appears in one of the sequences of the invention.
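
The comparison described above can be illustrated with a minimal sketch. It is not the software contemplated by the specification: it assumes a simple ungapped, sliding-window identity score, and the function names (`identity`, `search_variants`) and the two toy sequences are invented for illustration. Dedicated homology-search programs handle gaps and scoring far more carefully.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def search_variants(query: str, database: dict, min_identity: float = 0.9):
    """Slide `query` along every database sequence (no gaps) and report,
    for each sequence whose best window reaches `min_identity`,
    a tuple (name, offset, identity), best hits first."""
    query = query.upper()
    hits = []
    for name, seq in database.items():
        seq = seq.upper()
        best = None
        for offset in range(len(seq) - len(query) + 1):
            score = identity(query, seq[offset:offset + len(query)])
            if best is None or score > best[1]:
                best = (offset, score)
        if best is not None and best[1] >= min_identity:
            hits.append((name, best[0], best[1]))
    return sorted(hits, key=lambda hit: -hit[2])


if __name__ == "__main__":
    # Toy stand-ins for the sequences held on the data carrier (made up).
    toy_database = {
        "toy_A": "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG",
        "toy_B": "ATGAAACCCGGGTTTACGTACGTAGCTAGCTAGGATCC",
    }
    user_query = toy_database["toy_A"][9:29]
    for name, offset, score in search_variants(user_query, toy_database):
        print(name, offset, round(score, 2))
```

The `min_identity` parameter plays the role of the "specified level" of homology; a user would set it according to how close a match is required.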


DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 
Example 1: Explanation of the CD-ROM 

The CD-ROM contains basically three files. A first file termed (protein
family name)_for_patent includes all the nucleic acid sequences of the invention.
The sequences are arranged according to the functional (family) group, so that
there exist 43 files, each named in accordance with the functional group
name, as depicted in Table 1; for example, the first group is named
"Adaptor_for_patent".

This group includes all splice variants from SEQ ID NO:1 to SEQ ID
NO:5525 (termed in the sequence as NV_1 to NV_5525).

The second file is named "Adhesion_for_patent", and includes those
sequences from SEQ ID NO:5526-6579, termed NV_5526 to NV_6579, etc.

The amino acid sequences come inside each directory, in a file named (protein
family name)_pep_patent. Thus there are two different files: 1. (protein family
name)_for_patent (containing nucleotides); and 2. (protein family
name)_pep_patent (containing amino acids), in which corresponding sequences are
preceded by the same NV_number.
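
As a rough illustration of this file layout, the sketch below maps a SEQ ID number to the pair of file names expected to hold it. Only the five groups whose SEQ ID ranges are stated in this part of the text are listed; the remaining groups of Table 1 are omitted. The file names for groups 41 to 43 are an assumption that simply extends the naming pattern given for the first two groups, and the function name is hypothetical.

```python
import bisect

# (first SEQ ID, last SEQ ID, group name). Only ranges quoted in the text;
# the other functional groups of Table 1 are not reproduced here.
GROUP_RANGES = [
    (1, 5525, "Adaptor"),
    (5526, 6579, "Adhesion"),
    (45131, 46553, "Transfer"),
    (46554, 47029, "Translational-factors"),
    (47030, 48611, "Transporters"),
]
_STARTS = [first for first, _, _ in GROUP_RANGES]


def files_for_seq_id(seq_id: int):
    """Return (nucleotide file, peptide file) for a SEQ ID number,
    or None when the number falls outside the ranges listed above."""
    i = bisect.bisect_right(_STARTS, seq_id) - 1
    if i < 0:
        return None
    first, last, group = GROUP_RANGES[i]
    if not first <= seq_id <= last:
        return None
    # Assumed pattern: <group>_for_patent and <group>_pep_patent.
    return (f"{group}_for_patent", f"{group}_pep_patent")


if __name__ == "__main__":
    print(files_for_seq_id(5526))
```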

Preceding the actual sequence of each novel variant of the invention is the 
name of the original sequence from which it is varied, as well as the GenBank 
accession number of the original gene. 

Between the name of the original sequence and its accession number is
present the number of the novel variant of the invention, preceded by NV_.
Thus SEQ ID NO:1 is marked as NV_1; SEQ ID NO:2 is marked as NV_2, etc.

Since many times several novel variant sequences originate from the same 
original sequence, all of these novel variants originating from the same origin 
will be preceded by the description of the same original sequence and its 
accession number repeated again and again. 

The CD-ROM also includes "Table_summary_new.doc" (which is 
identical in fact to Table 1). 



Another table (file) present on the CD-ROM is "IP_Oct00.mdb" (Table
2). This table contains the names of all new variants, arranged by their SEQ ID
NO., beginning from NV_1 and ending in NV_48611. After each new ID No.
there is the "Old ID", which is the number of the sequence (NV_) as it appeared in
the priority document. For example, NV_4 in the priority document is NVJ in
the present case. After the variant indexes comes the description given in
GenBank of the original sequence from which it has been varied, and after it the
accession number of the original sequence from which it has been varied. Where
several novel variants are varied from the same original sequence, the description
and accession number of several consecutive lines will be identical. This table
"IP_Oct00.mdb" (Table 2) can be used for both nucleotides and amino acids.

Table 3 termed: "Clear Patent l.doc", concerns the NV_(or SEQ ID) 
Nos. of the priority document versus those of the present application. 

Example II: Variant nucleic acid sequence

The nucleic acid sequences of the invention include nucleic acid
sequences which encode variant product and fragments and analogs thereof. The
nucleic acid sequences may alternatively be sequences complementary to the
above coding sequence, or to a region of said coding sequence. The length of the
complementary sequence is sufficient to avoid the expression of the coding
sequence. The nucleic acid sequences may be in the form of RNA or in the form
of DNA, and include messenger RNA, synthetic RNA and DNA, cDNA, and
genomic DNA. The DNA may be double-stranded or single-stranded, and if
single-stranded may be the coding strand or the non-coding (anti-sense,
complementary) strand. The nucleic acid sequences may also include both
dNTPs and rNTPs, as well as non-naturally occurring sequences. The sequence may
also be a part of a hybrid between an amino acid sequence and a nucleic acid
sequence.

In a general embodiment, the nucleic acid sequence has at least 90%,
identity or 45% with any one of the sequences identified as SEQ ID NO:1 to SEQ
ID NO:48611, provided that this sequence is not completely identical with that of
the original sequence.

The nucleic acid sequences may include the coding sequence by itself. By
another alternative the coding region may be in combination with additional
coding sequences, such as those coding for fusion protein or signal peptides, in
combination with non-coding sequences, such as introns and control elements,
promoter and terminator elements or 5' and/or 3' untranslated regions, effective
for expression of the coding sequence in a suitable host, and/or in a vector or host
environment in which the variant nucleic acid sequence is introduced as a
heterologous sequence.

The nucleic acid sequences of the present invention may also have the
product coding sequence fused in-frame to a marker sequence which allows for
purification of the variant product. The marker sequence may be, for example, a
hexahistidine tag to provide for purification of the mature polypeptide fused to
the marker in the case of a bacterial host, or, the marker sequence may be a
hemagglutinin (HA) tag when a mammalian host, e.g. COS-7 cells, is used. The
HA tag corresponds to an epitope derived from the influenza hemagglutinin
protein (Wilson, I., et al., Cell 37:767 (1984)).

Also included in the scope of the invention are fragments as defined above,
also referred to herein as oligonucleotides, typically having at least 20 bases,
preferably 20-30 bases, corresponding to a region of the coding-sequence nucleic
acid sequence. The fragments may be used as probes, primers, and when
complementary also as antisense agents, and the like, according to known
methods.
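
To make the notion of a complementary 20-30 base fragment concrete, the following sketch enumerates candidate antisense oligonucleotides as reverse complements of windows of a coding sequence. The function names and the example sequence are invented for illustration, and nothing here addresses how a useful probe or antisense agent would actually be selected.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligos(coding: str, length: int = 20, step: int = 1):
    """List (start, oligo) pairs, where each oligo is the reverse complement
    of the `length`-base window of `coding` beginning at `start`."""
    if not 20 <= length <= 30:
        raise ValueError("length is expected to be 20-30 bases")
    return [
        (start, reverse_complement(coding[start:start + length]))
        for start in range(0, len(coding) - length + 1, step)
    ]


if __name__ == "__main__":
    toy_coding = "ATGGCCATTGTAATGGGCCGCTGAA"  # made-up 25-base sequence
    for start, oligo in antisense_oligos(toy_coding, length=20):
        print(start, oligo)
```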

As indicated above, the nucleic acid sequence may be substantially as
depicted in any one of SEQ ID NO:1 to SEQ ID NO:48611 or fragments thereof
or sequences having at least 90% identity to the above sequence as explained
above. Alternatively, due to the degenerate nature of the genetic code, the
sequence may be a sequence coding for any one of the amino acid sequences
coded by the sequence of SEQ ID NO:1 to SEQ ID NO:48611, or fragments or
analogs of said amino acid sequence.
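
The degeneracy of the genetic code can be shown with a short sketch: several distinct DNA sequences encode the same amino acid sequence. The code below builds the standard genetic code table and enumerates every DNA sequence encoding a short peptide. The peptide "MAK" and the function names are illustrative only and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def synonymous_codons(amino_acid: str):
    """All codons that encode the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def all_encodings(peptide: str):
    """Every DNA sequence that encodes `peptide`."""
    choices = [synonymous_codons(aa) for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


if __name__ == "__main__":
    encodings = all_encodings("MAK")
    print(len(encodings), "DNA sequences encode MAK, for example", encodings[:2])
```

The number of encodings grows multiplicatively with peptide length, so full enumeration is only practical for very short peptides.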

A. Preparation of nucleic acid sequences

The nucleic acid sequences may be obtained by screening cDNA libraries 
using oligonucleotide probes which can hybridize to or PCR-amplify nucleic acid 
sequences which encode the variant products disclosed above. cDNA libraries 
prepared from a variety of tissues are commercially available and procedures for 
screening and isolating cDNA clones are well-known to those of skill in the art.
Such techniques are described in, for example, Sambrook et al. (1989) Molecular 
Cloning: A Laboratory Manual (2nd Edition), Cold Spring Harbor Press, 
Plainview, N.Y. and Ausubel FM et al. (1989) Current Protocols in Molecular 
Biology, John Wiley & Sons, New York, N.Y. 

The nucleic acid sequences may be extended to obtain upstream and
downstream sequences such as promoters, regulatory elements, and 5' and 3'
untranslated regions (UTRs). Extension of the available transcript sequence may
be performed by numerous methods known to those of skill in the art, such as
PCR or primer extension (Sambrook et al., supra), or by the RACE method
using, for example, the Marathon RACE kit (Clontech, Cat. # K1802-1).

Alternatively, the technique of "restriction-site" PCR (Gobinda et al. PCR 
Methods Applic. 2:318-22, (1993)), which uses universal primers to retrieve 
flanking sequence adjacent a known locus, may be employed. First, genomic 
DNA is amplified in the presence of primer to a linker sequence and a primer
specific to the known region. The amplified sequences are subjected to a second
round of PCR with the same linker primer and another specific primer internal to 
the first one. Products of each round of PCR are transcribed with an appropriate 
RNA polymerase and sequenced using reverse transcriptase. 

Inverse PCR can be used to amplify or extend sequences using divergent
primers based on a known region (Triglia, T. et al., Nucleic Acids Res. 16:8186,
(1988)). The primers may be designed using OLIGO(R) 4.06 Primer Analysis
Software (1992; National Biosciences Inc, Plymouth, Minn.), or another
appropriate program, to be 22-30 nucleotides in length, to have a GC content of
50% or more, and to anneal to the target sequence at temperatures about 68-72°C.
The method uses several restriction enzymes to generate a suitable fragment in
the known region of a gene. The fragment is then circularized by intramolecular
ligation and used as a PCR template.
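
The length and GC-content criteria stated above for primers can be checked with a few lines of code. This is only a sketch with hypothetical function names. It does not model the annealing temperature, which depends on the melting-temperature method and reaction conditions, and it is no substitute for a primer-analysis program.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_primer_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.5) -> bool:
    """True when the primer is 22-30 nucleotides long with GC content of
    50% or more (annealing temperature is deliberately not checked)."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


if __name__ == "__main__":
    for candidate in ("GCGCGCATATGCGCGCATATGC", "ATATATATATATATATATATAT"):
        print(candidate, meets_primer_criteria(candidate))
```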

Capture PCR (Lagerstrom, M. et al, PCR Methods Applic. 1:111-19, 
(1991)) is a method for PCR amplification of DNA fragments adjacent to a 
known sequence in human and yeast artificial chromosome DNA. Capture PCR 
also requires multiple restriction enzyme digestions and ligations to place an 
engineered double-stranded sequence into a flanking part of the DNA molecule 
before PCR. 

Another method which may be used to retrieve flanking sequences is that
of Parker, J.D., et al. (Nucleic Acids Res., 19:3055-60, (1991)). Additionally, one
can use PCR, nested primers and PromoterFinder™ libraries to "walk in" 
genomic DNA (PromoterFinder™; Clontech, Palo Alto, CA). This process 
avoids the need to screen libraries and is useful in finding intron/exon junctions. 
Preferred libraries for screening for full length cDNAs are ones that have been 
size-selected to include larger cDNAs. Also, random primed libraries are 
preferred in that they will contain more sequences which contain the 5' and 
upstream regions of genes. 

A randomly primed library may be particularly useful if an oligo d(T) 
library does not yield a full-length cDNA. Genomic libraries are useful for 
extension into the 5' nontranslated regulatory region. 

The nucleic acid sequences and oligonucleotides of the invention can also 
be prepared by solid-phase methods, according to known synthetic methods. 
Typically, fragments of up to about 100 bases are individually synthesized, then 
joined to form continuous sequences up to several hundred bases.

B. Use of variant nucleic acid sequence for the production of
variant products

In accordance with the present invention, nucleic acid sequences specified 
above may be used as recombinant DNA molecules that direct the expression of
variant products. 

As will be understood by those of skill in the art, it may be advantageous 
to produce variant product-encoding nucleotide sequences possessing codons 
other than those which appear in any one of SEQ ID NO:1 to SEQ ID NO:48611,
which are those which naturally occur in the human genome. Codons preferred
by a particular prokaryotic or eukaryotic host (Murray, E. et al., Nuc. Acids Res.,
17:477-508, (1989)) can be selected, for example, to increase the rate of variant
product expression or to produce recombinant RNA transcripts having desirable
properties, such as a longer half-life, than transcripts produced from naturally
occurring sequence.
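
Codon substitution of this kind can be sketched as follows: each codon is replaced by a host-preferred synonymous codon while the encoded amino acid sequence is left unchanged. The preference table in the example is a placeholder chosen for illustration, not measured codon-usage data for any host, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon of its amino acid.
    Codons whose amino acid has no entry in `preferred` are kept as they are."""
    dna = dna.upper()
    out = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        choice = preferred.get(amino_acid, codon)
        if CODON_TABLE[choice] != amino_acid:
            raise ValueError(f"{choice} does not encode {amino_acid}")
        out.append(choice)
    return "".join(out)


if __name__ == "__main__":
    # Placeholder preferences (illustrative only, not real usage data).
    toy_preferences = {"L": "CTG", "A": "GCG", "K": "AAA"}
    original = "ATGTTAGCTAAG"
    optimized = recode(original, toy_preferences)
    print(original, "->", optimized, translate(original) == translate(optimized))
```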

The nucleic acid sequences of the present invention can be engineered in 
order to alter a variant product coding sequence for a variety of reasons, 
including but not limited to, alterations which modify the cloning, processing 
and/or expression of the product. For example, alterations may be introduced 
using techniques which are well known in the art, e.g., site-directed mutagenesis,
to insert new restriction sites, to alter glycosylation patterns, to change codon 
preference, etc. 

The present invention also includes recombinant constructs comprising 
one or more of the sequences as broadly described above. The constructs
comprise a vector, such as a plasmid or viral vector, into which a nucleic acid
sequence of the invention has been inserted, in a forward or reverse orientation.
In a preferred aspect of this embodiment, the construct further comprises
regulatory sequences, including, for example, a promoter, operably linked to the
sequence. Large numbers of suitable vectors and promoters are known to those
of skill in the art, and are commercially available. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are also
described in Sambrook, et al. (supra).

The present invention also relates to host cells which are genetically 
engineered with vectors of the invention, and the production of the product of the 
invention by recombinant techniques. Host cells are genetically engineered (i.e., 
transduced, transformed or transfected) with the vectors of this invention which 
may be, for example, a cloning vector or an expression vector. The vector may 
be, for example, in the form of a plasmid, a viral particle, a phage, etc. The 
engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants or amplifying the 
expression of the variant nucleic acid sequence. The culture conditions, such as 
temperature, pH and the like, are those previously used with the host cell selected 
for expression, and will be apparent to those skilled in the art. 

The nucleic acid sequences of the present invention may be included in 
any one of a variety of expression vectors for expressing a product. Such vectors 
include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., 
derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast 
plasmids; vectors derived from combinations of plasmids and phage DNA, viral 
DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies. However, 
any other vector may be used as long as it is replicable and viable in the host. 
The appropriate DNA sequence may be inserted into the vector by a variety of 
procedures. In general, the DNA sequence is inserted into an appropriate 
restriction endonuclease site(s) by procedures known in the art. Such procedures 
and related sub-cloning procedures are deemed to be within the scope of those 
skilled in the art. 

The DNA sequence in the expression vector is operatively linked to an 
appropriate transcription control sequence (promoter) to direct mRNA synthesis. 
Examples of such promoters include: LTR or SV40 promoter, the E.coli lac or 
trp promoter, the phage lambda PL promoter, and other promoters known to 
control expression of genes in prokaryotic or eukaryotic cells or their viruses.
The expression vector also contains a ribosome binding site for translation
initiation, and a transcription terminator. The vector may also include 
appropriate sequences for amplifying expression. In addition, the expression 
vectors preferably contain one or more selectable marker genes to provide a 
phenotypic trait for selection of transformed host cells such as dihydrofolate 
reductase or neomycin resistance for eukaryotic cell culture, or such as 
tetracycline or ampicillin resistance in E.coli. 

The vector containing the appropriate DNA sequence as described above, 
as well as an appropriate promoter or control sequence, may be employed to 
transform an appropriate host to permit the host to express the protein. Examples 
of appropriate expression hosts include: bacterial cells, such as E.coli, 
Streptomyces, Salmonella typhimurium; fungal cells, such as yeast; insect cells 
such as Drosophila and Spodoptera Sf9; animal cells such as CHO, COS, HEK 
293 or Bowes melanoma; adenoviruses; plant cells, etc. The selection of an 
appropriate host is deemed to be within the scope of those skilled in the art from 
the teachings herein. The invention is not limited by the host cells employed. 

In bacterial systems, a number of expression vectors may be selected 
depending upon the use intended for the variant product. For example, when 
large quantities of variant product are needed for the induction of antibodies, 
vectors which direct high level expression of fusion proteins that are readily 
purified may be desirable. Such vectors include, but are not limited to, 
multifunctional E.coli cloning and expression vectors such as Bluescript(R) 
(Stratagene), in which the variant polypeptide coding sequence may be ligated 
into the vector in-frame with sequences for the amino-terminal Met and the 
subsequent 7 residues of beta-galactosidase so that a hybrid protein is produced; 
pIN vectors (Van Heeke & Schuster J. Biol. Chem. 264:5503-5509, (1989)); pET 
vectors (Novagen, Madison WI); and the like. 

In the yeast Saccharomyces cerevisiae a number of vectors containing
constitutive or inducible promoters such as alpha factor, alcohol oxidase and
PGH may be used. For reviews, see Ausubel et al. (supra) and Grant et al.
(Methods in Enzymology 153:516-544, (1987)).

In cases where plant expression vectors are used, the expression of a 
sequence encoding variant product may be driven by any of a number of 
promoters. For example, viral promoters such as the 35S and 19S promoters of 
CaMV (Brisson et al, Nature 310:511-514. (1984)) may be used alone or in 
combination with the omega leader sequence from TMV (Takamatsu et al, 
EMBO J., 6:307-311, (1987)). Alternatively, plant promoters such as the small 
subunit of RUBISCO (Coruzzi et al., EMBO J. 3:1671-1680, (1984); Broglie et 
al, Science 224:838-843, (1984)); or heat shock promoters (Winter J and 
Sinibaldi R.M., Results Probl Cell Differ., 17:85-105, (1991)) may be used. 
These constructs can be introduced into plant cells by direct DNA transformation 
or pathogen-mediated transfection. For reviews of such techniques, see Hobbs S. 
or Murry L.E. (1992) in McGraw Hill Yearbook of Science and Technology, 
McGraw Hill, New York, N.Y., pp 191-196; or Weissbach and Weissbach 
(1988) Methods for Plant Molecular Biology, Academic Press, New York, N.Y., 
pp 421-463. 

Variant product may also be expressed in an insect system. In one such 
system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a 
vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia 
larvae. The variant product coding sequence may be cloned into a nonessential 
region of the virus, such as the polyhedrin gene, and placed under control of the 
polyhedrin promoter. Successful insertion of variant coding sequence will render
the polyhedrin gene inactive and produce recombinant virus lacking coat
protein. The recombinant viruses are then used to infect S. frugiperda cells or
Trichoplusia larvae in which variant protein is expressed (Smith et al., J. Virol.
46:584, (1983); Engelhard, E.K. et al., Proc. Natl. Acad. Sci. 91:3224-7, (1994)).

In mammalian host cells, a number of viral-based expression systems may 
be utilized. In cases where an adenovirus is used as an expression vector, a
variant product coding sequence may be ligated into an adenovirus
transcription/translation complex consisting of the late promoter and tripartite
leader sequence. Insertion in a nonessential E1 or E3 region of the viral genome
will result in a viable virus capable of expressing variant protein in infected host
cells (Logan and Shenk, Proc. Natl. Acad. Sci. 81:3655-59, (1984)). In addition,
transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be
used to increase expression in mammalian host cells.

Specific initiation signals may also be required for efficient translation of
a variant product coding sequence. These signals include the ATG initiation
codon and adjacent sequences. In cases where variant product coding sequence,
its initiation codon and upstream sequences are inserted into the appropriate
expression vector, no additional translational control signals may be needed.
However, in cases where only coding sequence, or a portion thereof, is inserted,
exogenous translational control signals including the ATG initiation codon
must be provided. Furthermore, the initiation codon must be in the correct
reading frame to ensure translation of the entire insert. Exogenous
translational elements and initiation codons can be of various origins, both
natural and synthetic. The efficiency of expression may be enhanced by the
inclusion of enhancers appropriate to the cell system in use (Scharf, D. et al.,
Results Probl. Cell Differ., 20:125-62, (1994); Bittner et al., Methods in
Enzymol. 153:516-544, (1987)).
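
The requirement that the initiation codon be present and in the correct reading frame can be illustrated with a small check that reads an insert in frame from its first ATG until a stop codon is reached. This is a simplified sketch with a hypothetical function name: it ignores the "adjacent sequences" mentioned above and any vector-supplied signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_open_reading_frame(insert: str):
    """Return the sequence from the first ATG through the first in-frame
    stop codon, or None when there is no ATG or no in-frame stop."""
    seq = insert.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]
    return None


if __name__ == "__main__":
    print(first_open_reading_frame("CCATGGCTTAAGG"))
```

A single-base shift upstream of the coding region changes which codons are read, which is why the frame of the initiation codon matters for expression of the entire insert.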

In a further embodiment, the present invention relates to host cells 
containing the above-described constructs. The host cell can be a higher 
eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, such as a 
yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. 
Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-Dextran mediated transfection, or electroporation 
(Davis, L., Dibner, M., and Battey, I. (1986) Basic Methods in Molecular 
Biology). Cell-free translation systems can also be employed to produce 
polypeptides using RNAs derived from the DNA constructs of the present 
invention.

A host cell strain may be chosen for its ability to modulate the expression
of the inserted sequences or to process the expressed protein in the desired 
fashion. Such modifications of the protein include, but are not limited to, 
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and 
acylation. Post-translational processing which cleaves a "pre-pro" form of the
protein may also be important for correct insertion, folding and/or function. 
Different host cells such as CHO, HeLa, MDCK, 293, WI38, etc. have specific 
cellular machinery and characteristic mechanisms for such post-translational 
activities and may be chosen to ensure the correct modification and processing of 
the introduced, foreign protein.

For long-term, high-yield production of recombinant proteins, stable 
expression is preferred. For example, cell lines which stably express variant 
product may be transformed using expression vectors which contain viral origins 
of replication or endogenous expression elements and a selectable marker gene. 
Following the introduction of the vector, cells may be allowed to grow for 1-2
days in an enriched media before they are switched to selective media. The 
purpose of the selectable marker is to confer resistance to selection, and its 
presence allows growth and recovery of cells which successfully express the 
introduced sequences. Resistant clumps of stably transformed cells can be 
proliferated using tissue culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell 
lines. These include, but are not limited to, the herpes simplex virus thymidine 
kinase (Wigler M., et al, Cell 11:223-32, (1977)) and adenine 
phosphoribosyltransferase (Lowy I., et al, Cell 22:817-23, (1980)) genes which 
can be employed in tk- or aprt- cells, respectively. Also, antimetabolite,
antibiotic or herbicide resistance can be used as the basis for selection; for
example, dhfr, which confers resistance to methotrexate (Wigler M., et al., Proc.
Natl. Acad. Sci. 77:3567-70, (1980)); npt, which confers resistance to the
aminoglycosides neomycin and G-418 (Colbere-Garapin, F. et al., J. Mol. Biol.,
150:1-14, (1981)); and als or pat, which confer resistance to chlorsulfuron and
phosphinothricin acetyltransferase, respectively (Murry, supra). Additional
selectable genes have been described, for example, trpB, which allows cells to
utilize indole in place of tryptophan, or hisD, which allows cells to utilize histidinol
in place of histidine (Hartman S.C. and R.C. Mulligan, Proc. Natl. Acad. Sci.
85:8047-51, (1988)). The use of visible markers has gained popularity with such
markers as anthocyanins, beta-glucuronidase and its substrate, GUS, and
luciferase and its substrates, luciferin and ATP, being widely used not only to
identify transformants, but also to quantify the amount of transient or stable
protein expression attributable to a specific vector system (Rhodes, C.A. et al.,
Methods Mol. Biol., 55:121-131, (1995)).

Host cells transformed with a nucleotide sequence encoding variant 
product may be cultured under conditions suitable for the expression and 
recovery of the encoded protein from cell culture. The product produced by a 
recombinant cell may be secreted or contained intracellularly depending on the 
sequence and/or the vector used. As will be understood by those of skill in the
art, expression vectors containing nucleic acid sequences encoding variant 
product can be designed with signal sequences which direct secretion of variant 
product through a prokaryotic or eukaryotic cell membrane. 

The variant product may also be expressed as a recombinant protein with 
one or more additional polypeptide domains added to facilitate protein
purification. Such purification facilitating domains include, but are not limited
to, metal chelating peptides such as histidine-tryptophan modules that allow
purification on immobilized metals, protein A domains that allow purification on
immobilized immunoglobulin, and the domain utilized in the FLAGS
extension/affinity purification system (Immunex Corp, Seattle, Wash.). The
inclusion of a protease-cleavable polypeptide linker sequence between the
purification domain and variant product is useful to facilitate purification. One
such expression vector provides for expression of a fusion protein comprising
a variant polypeptide fused to a polyhistidine region separated by an enterokinase
cleavage site. The histidine residues facilitate purification on IMAC
(immobilized metal ion affinity chromatography, as described in Porath, et al.,
Protein Expression and Purification, 3:263-281, (1992)) while the enterokinase
cleavage site provides a means for isolating variant polypeptide from the fusion 
protein. pGEX vectors (Promega, Madison, Wis.) may also be used to express 
foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In 
general, such fusion proteins are soluble and can easily be purified from lysed 
cells by adsorption to ligand-agarose beads (e.g., glutathione-agarose in the case 
of GST- fusions) followed by elution in the presence of free ligand. 

Following transformation of a suitable host strain and growth of the host 
strain to an appropriate cell density, the selected promoter is induced by 
appropriate means (e.g., temperature shift or chemical induction) and cells are 
cultured for an additional period. Cells are typically harvested by centrifugation, 
disrupted by physical or chemical means, and the resulting crude extract retained 
for further purification. Microbial cells employed in expression of proteins can 
be disrupted by any convenient method, including freeze-thaw cycling, 
sonication, mechanical disruption, or use of cell lysing agents, or other methods, 
which are well known to those skilled in the art.

The variant products can be recovered and purified from recombinant cell 
cultures by any of a number of methods well known in the art, including 
ammonium sulfate or ethanol precipitation, acid extraction, anion or cation 
exchange chromatography, phosphocellulose chromatography, hydrophobic 
interaction chromatography, affinity chromatography, hydroxylapatite 
chromatography, and lectin chromatography. Protein refolding steps can be 
used, as necessary, in completing configuration of the mature protein. Finally, 
high performance liquid chromatography (HPLC) can be employed for final 
purification steps. 

C. Diagnostic applications utilizing nucleic acid sequences 

The nucleic acid sequences of the present invention may be used for a 
variety of diagnostic purposes. The nucleic acid sequences may be used to detect
and quantitate expression of the variant in patient's cells, e.g. biopsied tissues, by
detecting the presence of mRNA coding for variant product. Alternatively, the
assay may be used to detect soluble variant in the serum or blood. This assay
typically involves obtaining total mRNA from the tissue or serum and contacting
the mRNA with a nucleic acid probe. The probe is a nucleic acid molecule of at
least 20 nucleotides, preferably 20-30 nucleotides, capable of specifically
hybridizing with a sequence included within the sequence of a nucleic acid
molecule encoding variant product under hybridizing conditions, detecting the
presence of mRNA hybridized to the probe, and thereby detecting the expression
of variant. This assay can be used to distinguish between absence, presence, and
excess expression of variant product and to monitor levels of variant expression
during therapeutic intervention. In addition, the assay may be used to compare
the levels of the variant of the invention to the levels of the original sequence
from which it has been varied or to levels of other variants, which comparison
may have some physiological meaning.

The invention also contemplates the use of the nucleic acid sequences as a 
diagnostic for diseases resulting from inherited defective variant sequences, or 
diseases in which the ratio of the amount of the original sequence from which the 
variant was varied to the novel variants of the invention is altered. These
sequences can be detected by comparing the sequences of the defective (i.e.,
mutant) variant coding region with that of a normal coding region. Association 
of the sequence coding for mutant variant product with abnormal variant product 
activity may be verified. In addition, sequences encoding mutant variant 
products can be inserted into a suitable vector for expression in a functional assay
system (e.g., colorimetric assay, complementation experiments in a variant
protein deficient strain of HEK293 cells) as yet another means to verify or 
identify mutations. Once mutant genes have been identified, one can then screen 
populations of interest for carriers of the mutant gene. 

Individuals carrying mutations in the nucleic acid sequence of the present
invention may be detected at the DNA level by a variety of techniques. Nucleic
acids used for diagnosis may be obtained from a patient's cells, including but not
limited to cells from blood, urine, saliva, placenta, tissue biopsy and autopsy
material. Genomic DNA may be used directly for detection or may be amplified
enzymatically by using PCR (Saiki, et al., Nature 324:163-166, (1986)) prior to
analysis. RNA or cDNA may also be used for the same purpose. As an example,
PCR primers complementary to the nucleic acid of the present invention can be
used to identify and analyze mutations in the gene of the present invention.
Deletions and insertions can be detected by a change in size of the amplified
product in comparison to the normal genotype.
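
The size comparison described above can be illustrated with a minimal sketch.
The function name, the tolerance parameter and the example sizes are
hypothetical and are not taken from the specification; the sketch only assumes
that the amplified product length and the normal product length are both
known in base pairs.

```python
def classify_amplicon(observed_bp: int, normal_bp: int, tolerance_bp: int = 0) -> str:
    """Classify a PCR product by its size relative to the normal amplicon.

    tolerance_bp is an assumed allowance for sizing error (e.g., on a gel).
    """
    if observed_bp <= 0 or normal_bp <= 0:
        raise ValueError("amplicon sizes must be positive")
    if tolerance_bp < 0:
        raise ValueError("tolerance must not be negative")
    difference = observed_bp - normal_bp
    if abs(difference) <= tolerance_bp:
        return "no size change"
    # A longer product suggests an insertion, a shorter one a deletion.
    return "insertion" if difference > 0 else "deletion"


if __name__ == "__main__":
    for size in (250, 262, 230):
        print(size, classify_amplicon(size, normal_bp=250))
```

A size shift alone does not locate or characterize the change; the methods
described in the following paragraphs address point mutations, which leave
the product size unchanged.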

Point mutations can be identified by hybridizing amplified DNA to
radiolabeled RNA of the invention or alternatively, radiolabeled antisense DNA
sequences of the invention. Sequence changes at specific locations may also be
revealed by nuclease protection assays, such as RNase and S1 protection or the
chemical cleavage method (e.g. Cotton, et al., Proc. Natl. Acad. Sci. USA,
85:4397-4401, (1985)), or by differences in melting temperatures. "Molecular
beacons" (Kostrikis L.G. et al., Science 279:1228-1229, (1998)), hairpin-shaped,
single-stranded synthetic oligonucleotides containing probe sequences which
are complementary to the nucleic acid of the present invention, may also be used
to detect point mutations or other sequence changes as well as monitor
expression levels of variant product. Such diagnostics would be particularly
useful for prenatal testing.

Another method for detecting mutations uses two DNA probes which are
designed to hybridize to adjacent regions of a target, with abutting bases, where
the region of known or suspected mutation(s) is at or near the abutting bases.
The two probes may be joined at the abutting bases, e.g., in the presence of a
ligase enzyme, but only if both probes are correctly base paired in the region of
probe junction. The presence or absence of mutations is then detectable by the
presence or absence of ligated probe.
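
The logic of the two-probe ligation method can be sketched as follows. This is
a simplified model under stated assumptions, not a description of any
particular assay: the probes are assumed to be designed from a normal
reference sequence, the sample is assumed to differ from the reference only by
substitutions (so both have the same length), and ligation is assumed to
require perfect pairing within a fixed window on each side of the junction.
The function name and the window size are hypothetical.

```python
def ligation_expected(reference: str, sample: str, junction: int, window: int = 3) -> bool:
    """Return True if probes designed from `reference` would be correctly
    base paired with `sample` around the probe junction.

    `junction` is the index of the first base covered by the downstream probe.
    `window` is the assumed number of bases on each side of the junction that
    must pair perfectly for the ligase to join the probes.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only models substitutions (equal lengths)")
    if not 0 < junction < len(reference):
        raise ValueError("junction must lie inside the sequence")
    low = max(0, junction - window)
    high = min(len(reference), junction + window)
    # A mismatch near the junction prevents ligation; no ligated probe is seen.
    return reference[low:high].upper() == sample[low:high].upper()


if __name__ == "__main__":
    normal = "ACGTACGTAC"
    mutant = "ACGTTCGTAC"  # substitution at index 4, next to the junction
    print(ligation_expected(normal, normal, junction=5))  # ligated probe present
    print(ligation_expected(normal, mutant, junction=5))  # ligated probe absent
```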

Also suitable for detecting mutations in the variant product coding
sequence are oligonucleotide array methods based on sequencing by
hybridization (SBH), as described, for example, in U.S. Patent No. 5,547,839. In
a typical method, the DNA target analyte is hybridized with an array of
oligonucleotides formed on a microchip. The sequence of the target can then be
"read" from the pattern of target binding to the array.

D. Gene mapping utilizing nucleic acid sequences 

The nucleic acid sequences of the present invention are also valuable for
chromosome identification. The sequence is specifically targeted to and can
hybridize with a particular location on an individual human chromosome.
Moreover, there is a current need for identifying particular sites on the
chromosome. Few chromosome marking reagents based on actual sequence data
(repeat polymorphisms) are presently available for marking chromosomal
location. The mapping of DNAs to chromosomes according to the present
invention is an important first step in correlating those sequences with genes
associated with disease.

Briefly, sequences can be mapped to chromosomes by preparing PCR
primers (preferably 20-30 bp) from the variant cDNA. Computer analysis of the
3' untranslated region is used to rapidly select primers that do not span more than
one exon in the genomic DNA, which would complicate the amplification
process. These primers are then used for PCR screening of somatic cell hybrids
containing individual human chromosomes. Only those hybrids containing the
human gene corresponding to the primer will yield an amplified fragment.
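
The primer selection step can be sketched as below. The sketch assumes that
exon boundaries are already known in cDNA coordinates as half-open
(start, end) intervals; these coordinates, the function name and the example
sequence are hypothetical. It only enumerates candidate primers that lie
wholly within one exon and does not model melting temperature, specificity or
other design criteria.

```python
from typing import List, Tuple


def primers_within_single_exon(
    cdna: str, exons: List[Tuple[int, int]], length: int = 20
) -> List[Tuple[int, str]]:
    """Return (position, sequence) for every window of `length` bases that
    lies entirely inside one exon, so that no primer spans an exon junction.
    """
    if length <= 0:
        raise ValueError("primer length must be positive")
    candidates = []
    for start, end in exons:
        if not 0 <= start <= end <= len(cdna):
            raise ValueError("exon interval outside the cDNA")
        # The last valid start position keeps the whole primer inside the exon.
        for pos in range(start, end - length + 1):
            candidates.append((pos, cdna[pos:pos + length]))
    return candidates


if __name__ == "__main__":
    cdna = "A" * 30 + "C" * 25           # toy sequence: two exons of 30 and 25 bases
    exons = [(0, 30), (30, 55)]
    found = primers_within_single_exon(cdna, exons, length=20)
    print(len(found), "candidate primers")
```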

PCR mapping of somatic cell hybrids, or alternatively of radiation hybrids,
is a rapid procedure for assigning a particular DNA to a particular chromosome.
Using the present invention with the same oligonucleotide primers,
sublocalization can be achieved with panels of fragments from specific
chromosomes or pools of large genomic clones in an analogous manner. Other
mapping strategies that can similarly be used to map a sequence to its chromosome
include in situ hybridization, prescreening with labeled flow-sorted chromosomes
and preselection by hybridization to construct chromosome-specific cDNA libraries.

Fluorescence in situ hybridization (FISH) of a cDNA clone to a metaphase
chromosomal spread can be used to provide a precise chromosomal location in
one step. This technique can be used with cDNA as short as 50 or 60 bases. For a
review of this technique, see Verma et al., Human Chromosomes: a Manual of
Basic Techniques, (1988) Pergamon Press, New York.

Once a sequence has been mapped to a precise chromosomal location, the
physical position of the sequence on the chromosome can be correlated with
genetic map data. Such data are found, for example, in the OMIM database
(Center for Medical Genetics, Johns Hopkins University, Baltimore, MD and
National Center for Biotechnology Information, National Library of Medicine,
Bethesda, MD). The OMIM gene map presents the cytogenetic map location of
disease genes and other expressed genes. The OMIM database provides
information on diseases associated with the chromosomal location. Such
associations include the results of linkage analysis mapped to this interval, and
the correlation of translocations and other chromosomal aberrations in this area
with the advent of polygenic diseases, such as cancer in general and prostate
cancer in particular.

E. Therapeutic applications of nucleic acid sequences 

Nucleic acid sequences of the invention may also be used for therapeutic
purposes. Turning first to the second aspect of the invention (i.e. inhibition of
expression of variant), expression of variant product may be modulated through
antisense technology, which controls gene expression through hybridization of
complementary nucleic acid sequences, i.e. antisense DNA or RNA, to the
control, 5' or regulatory regions of the gene encoding variant product. For
example, the 5' coding portion of the nucleic acid sequence which
codes for the product of the present invention is used to design an antisense
oligonucleotide of from about 10 to 40 base pairs in length. Oligonucleotides
derived from the transcription start site, e.g. between positions -10 and +10 from
the start site, are preferred. An antisense DNA oligonucleotide is designed to be
complementary to a region of the nucleic acid sequence involved in transcription
(Lee et al., Nucl. Acids Res., 6:3073, (1979); Cooney et al., Science 241:456,
(1988); and Dervan et al., Science 251:1360, (1991)), thereby preventing
transcription and the production of the variant products. An antisense RNA
oligonucleotide hybridizes to the mRNA in vivo and blocks translation of the
mRNA molecule into the variant products (Okano, J. Neurochem., 56:560,
(1991)). The antisense constructs can be delivered to cells by procedures known
in the art such that the antisense RNA or DNA may be expressed in vivo. The
antisense may be antisense mRNA or a DNA sequence capable of coding such
antisense mRNA. The antisense mRNA or the DNA coding therefor can be
complementary to the full sequence of nucleic acid sequences coding for the
variant protein or to a fragment of such a sequence which is sufficient to inhibit
production of a protein product.

Turning now to the first aspect of the invention, i.e. expression of variant,
expression of variant product may be increased by providing coding sequences
coding for said product under the control of suitable control elements enabling
its expression in the desired host.

The nucleic acid sequences of the invention may be employed in
combination with a suitable pharmaceutical carrier. Such compositions comprise
a therapeutically effective amount of the compound, and a pharmaceutically
acceptable carrier or excipient. Such a carrier includes but is not limited to saline,
buffered saline, dextrose, water, glycerol, ethanol, and combinations thereof. The
formulation should suit the mode of administration.

The products of the invention, as well as any activator and deactivator
compounds (see below) which are polypeptides, may also be employed in
accordance with the present invention by expression of such polypeptides in vivo,
which is often referred to as "gene therapy." Cells from a patient may be
engineered with a nucleic acid sequence (DNA or RNA) encoding a polypeptide
ex vivo, with the engineered cells then being provided to a patient to be treated
with the polypeptide. Such methods are well-known in the art. For example, cells
may be engineered by procedures known in the art by use of a retroviral particle
containing RNA encoding a polypeptide of the present invention.

Similarly, cells may be engineered in vivo for expression of a polypeptide
in vivo by procedures known in the art. As known in the art, a producer cell for
producing a retroviral particle containing RNA encoding the polypeptide of the
present invention may be administered to a patient for engineering cells in vivo
and expression of the polypeptide in vivo. These and other methods for
administering a product of the present invention by such method should be
apparent to those skilled in the art from the teachings of the present invention.
For example, the expression vehicle for engineering cells may be other than a
retrovirus, for example, an adenovirus which may be used to engineer cells in
vivo after combination with a suitable delivery vehicle.

Retroviruses from which the retroviral plasmid vectors mentioned above
may be derived include, but are not limited to, Moloney Murine Leukemia Virus,
spleen necrosis virus, retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma
Virus, avian leukosis virus, gibbon ape leukemia virus, human
immunodeficiency virus, adenovirus, Myeloproliferative Sarcoma Virus, and
mammary tumor virus.

The retroviral plasmid vector is employed to transduce packaging cell
lines to form producer cell lines. Examples of packaging cells which may be
transfected include, but are not limited to, the PE501, PA317, psi-2, psi-AM,
PA12, T19-14X, VT-19-17-H2, psi-CRE, psi-CRIP, GP+E-86, GP+envAm12,
and DAN cell lines, as described in Miller (Human Gene Therapy, Vol. 1, pg.
5-14, (1990)). The vector may transduce the packaging cells through any means
known in the art. Such means include, but are not limited to, electroporation, the
use of liposomes, and CaPO4 precipitation. In one alternative, the retroviral
plasmid vector may be encapsulated into a liposome, or coupled to a lipid, and
then administered to a host.

The producer cell line generates infectious retroviral vector particles
which include the nucleic acid sequence(s) encoding the polypeptides. Such
retroviral vector particles then may be employed to transduce eukaryotic cells,
either in vitro or in vivo. The transduced eukaryotic cells will express the nucleic
acid sequence(s) encoding the polypeptide. Eukaryotic cells which may be
transduced include, but are not limited to, embryonic stem cells, embryonic
carcinoma cells, as well as hematopoietic stem cells, hepatocytes, fibroblasts,
myoblasts, keratinocytes, endothelial cells, and bronchial epithelial cells.

The genes introduced into cells may be placed under the control of
inducible promoters, such as the radiation-inducible Egr-1 promoter (Maceri,
H.J., et al., Cancer Res., 56(19):4311 (1996)), to stimulate variant production or
antisense inhibition in response to radiation, e.g., radiation therapy for treating
tumors.

Example III. Variant product 

The substantially purified variant product of the invention has been
defined above as the product coded from the nucleic acid sequence of the
invention. Preferably the amino acid sequence is an amino acid sequence having
at least 90% identity to any one of the sequences coded by the nucleic acid
sequence of SEQ ID NO:1 to SEQ ID NO:48611 provided that the amino acid
sequence is not identical to that of the original sequence from which it has been
varied. The protein or polypeptide may be in mature and/or modified form, also
as defined above. Also contemplated are protein fragments having at least 10
contiguous amino acid residues, preferably at least 10-20 residues, derived from
the variant product, as well as homologues as explained above.
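
The two conditions stated above (at least 90% identity to an encoded sequence,
and non-identity to the original sequence) can be checked with a short sketch.
The sketch assumes the two sequences have already been aligned to equal length
with "-" marking gaps, and it computes identity as matching positions divided
by alignment length; this is only one of several conventions, and the term
"identity" as defined elsewhere in the specification governs. The function
names and example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same
    residue. Columns containing a gap ("-") never count as a match.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_criterion(
    aligned_candidate: str, aligned_encoded: str, original: str, threshold: float = 90.0
) -> bool:
    """True if the candidate is at least `threshold` percent identical to the
    encoded sequence and is not identical to the original sequence.
    """
    candidate = aligned_candidate.replace("-", "")
    identical_to_original = candidate == original
    return percent_identity(aligned_candidate, aligned_encoded) >= threshold and not identical_to_original


if __name__ == "__main__":
    encoded = "MKTAYIAKQK"     # toy stand-in for a sequence coded by a SEQ ID NO
    candidate = "MKTAYIAKQR"   # one conservative substitution (K to R)
    print(percent_identity(candidate, encoded))
    print(meets_identity_criterion(candidate, encoded, original="MKTAYIAK"))
```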

The sequence variations are preferably those that are considered
conserved substitutions, as defined above. Thus, for example, a protein with a
sequence having at least 90% sequence identity with any of the products coded
by SEQ ID NO:1 to SEQ ID NO:48611, preferably by utilizing conserved
substitutions as defined above, is also part of the invention, provided that it is
not identical to the original peptide from which it has been varied. The variant
product may be (i) one in which one or more of the amino acid residues in a
sequence listed above are substituted with a conserved or non-conserved amino
acid residue (preferably a conserved amino acid residue), or (ii) one in which one
or more of the amino acid residues includes a substituent group, or (iii) one in
which the variant product is fused with another compound, such as a compound
to increase the half-life of the protein (for example, polyethylene glycol (PEG)),
or a moiety which serves as targeting means to direct the protein to its target
tissue or target cell population (such as an antibody), or (iv) one in which
additional amino acids are fused to the variant product. Such fragments, variants
and derivatives are deemed to be within the scope of those skilled in the art from
the teachings herein.

A. Preparation of variant product 

Recombinant methods for producing and isolating the variant product and
fragments of the protein are described above.

In addition to recombinant production, fragments and portions of variant
product may be produced by direct peptide synthesis using solid-phase
techniques (cf. Stewart et al., (1969) Solid-Phase Peptide Synthesis, WH
Freeman Co, San Francisco; Merrifield J., J. Am. Chem. Soc., 85:2149-2154,
(1963)). In vitro peptide synthesis may be performed using manual techniques or
by automation. Automated synthesis may be achieved, for example, using
Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer, Foster City, Calif.)
in accordance with the instructions provided by the manufacturer. Fragments of
variant product may be chemically synthesized separately and combined using
chemical methods to produce the full length molecule.

B. Therapeutic uses and compositions utilizing the variant product

The variant product of the invention is generally useful in treating diseases
and disorders which are characterized by a lower than normal level of variant
expression, and/or diseases which can be cured or ameliorated by raising the
level of the variant product, even if the level is normal.

Variant products or fragments may be administered by any of a number of
routes and methods designed to provide a consistent and predictable
concentration of compound at the target organ or tissue. The product-containing
compositions may be administered alone or in combination with other agents,
such as stabilizing compounds, and/or in combination with other pharmaceutical
agents such as drugs or hormones.

Variant product-containing compositions may be administered by a
number of routes including, but not limited to, oral, intravenous, intramuscular,
transdermal, subcutaneous, topical, sublingual, or rectal means as well as by
nasal application. Variant product-containing compositions may also be
administered via liposomes. Such administration routes and appropriate
formulations are generally known to those of skill in the art.

The product can be given via intravenous or intraperitoneal injection.
Similarly, the product may be injected to other localized regions of the body.
The product may also be administered via nasal insufflation. Enteral
administration is also possible. For such administration, the product should be
formulated into an appropriate capsule or elixir for oral administration, or into a
suppository for rectal administration.

The foregoing exemplary administration modes will likely require that the
product be formulated into an appropriate carrier, including ointments, gels,
and suppositories. Appropriate formulations are well known to persons skilled
in the art.

Dosage of the product will vary, depending upon the potency and
therapeutic index of the particular polypeptide selected.

A therapeutic composition for use in the treatment method can include the
product in a sterile injectable solution, the polypeptide in an oral delivery
vehicle, the product in an aerosol suitable for nasal administration, or the product
in a nebulized form, all prepared according to well known methods. Such
compositions comprise a therapeutically effective amount of the compound, and
a pharmaceutically acceptable carrier or excipient. Such a carrier includes but is
not limited to saline, buffered saline, dextrose, water, glycerol, ethanol, and
combinations thereof.

Example IV. Screening methods for activators and deactivators (inhibitors)

The present invention also includes an assay for identifying molecules,
such as synthetic drugs, antibodies, peptides, or other molecules, which have a
modulating effect on the activity of the variant product, e.g. activators or
deactivators of the variant product of the present invention. Such an assay
comprises the steps of providing a variant product encoded by the nucleic acid
sequences of the present invention, contacting the variant protein with one or
more candidate molecules to determine the candidate molecules' modulating
effect on the activity of the variant product, and selecting from the molecules a
candidate molecule capable of modulating variant product physiological
activity.

The variant product, its catalytic or immunogenic fragments or
oligopeptides thereof, can be used for screening therapeutic compounds in any of
a variety of drug screening techniques. The fragment employed in such a test
may be free in solution, affixed to a solid support, borne on a cell membrane or
located intracellularly. The formation of binding complexes, between variant
product and the agent being tested, may be measured. Alternatively, the activator
or deactivator may work by serving as agonist or antagonist, respectively, of the
variant receptor, binding entity or target site, and their effect may be determined
in connection with any of the above.

Another technique for drug screening, which provides for high throughput
screening of compounds having suitable binding affinity to the variant product,
is described in detail by Geysen in PCT Application WO
84/03564, published on Sep. 13, 1984. In summary, large numbers of different
small peptide test compounds are synthesized on a solid substrate, such as plastic
pins or some other surface. The peptide test compounds are reacted with the full
variant product or with fragments of variant product and washed. Bound variant
product is then detected by methods well known in the art. Substantially purified
variant product can also be coated directly onto plates for use in the
aforementioned drug screening techniques. Alternatively, non-neutralizing
antibodies can be used to capture the peptide and immobilize it on a solid
support.

Antibodies to the variant product, as described in Example VI below, may
also be used in screening assays according to methods well known in the art. For
example, a "sandwich" assay may be performed, in which an anti-variant
antibody is affixed to a solid surface such as a microtiter plate and variant
product is added. Such an assay can be used to capture compounds which bind
to the variant product. Alternatively, such an assay may be used to measure the
ability of compounds to interfere with the binding of variant product to the
variant receptor, and then select those compounds which affect the binding.

Example VI. Anti-variant antibodies/distinguishing antibodies 
A. Synthesis

In still another aspect of the invention, the purified variant product is used
to produce anti-variant antibodies which have diagnostic and therapeutic uses
related to the activity, distribution, and expression of the variant product. As
indicated above, the antibodies may also be directed solely to amino acid
sequences present in the variant but not present in the original sequence, or to
sequences present only in the original sequence but not in the variant
(distinguishing antibodies).

Antibodies to the variant product or to the distinguishing sequence present
only in the variant or only in the original sequence (the latter termed
"distinguishing antibodies") may be generated by methods well known in the
art. Such antibodies may include, but are not limited to, polyclonal, monoclonal,
chimeric, humanized, single chain, Fab fragments and fragments produced by an
Fab expression library. Antibodies which inhibit dimer formation are
especially preferred for therapeutic use.

A fragment of the variant product for antibody induction does not require
biological activity but must have immunological activity; that is, the
protein fragment or oligopeptide must be antigenic. Peptides used to induce
specific antibodies may have an amino acid sequence consisting of at least five
amino acids, preferably at least 10 amino acids, of any sequences coded by the
nucleic acid sequence of SEQ ID NO:1 to SEQ ID NO:48611 or in distinguishing
sequences present only in the variant or only in the original sequence as
explained above. Preferably they should mimic a portion of the amino acid
sequence of the natural protein and may contain the entire amino acid sequence
of a small, naturally occurring molecule. Short stretches of variant protein amino
acids may be fused with those of another protein such as keyhole limpet
hemocyanin and antibody produced against the chimeric molecule. Procedures
well known in the art can be used for the production of antibodies to variant
product.

For the production of antibodies, various hosts including goats, rabbits,
rats, mice, etc. may be immunized by injection with variant product or any
portion, fragment or oligopeptide which retains immunogenic properties.
Depending on the host species, various adjuvants may be used to increase
immunological response. Such adjuvants include but are not limited to Freund's,
mineral gels such as aluminum hydroxide, and surface active substances such as
lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole
limpet hemocyanin, and dinitrophenol. BCG (bacilli Calmette-Guerin) and
Corynebacterium parvum are potentially useful human adjuvants.

Monoclonal antibodies to variant protein may be prepared using any
technique which provides for the production of antibody molecules by
continuous cell lines in culture. These include but are not limited to the
hybridoma technique originally described by Koehler and Milstein (Nature
256:495-497, (1975)), the human B-cell hybridoma technique (Kosbor et al.,
Immunol. Today 4:72, (1983); Cote et al., Proc. Natl. Acad. Sci. 80:2026-2030,
(1983)) and the EBV-hybridoma technique (Cole, et al., Mol. Cell Biol.
62:109-120, (1984)).

Techniques developed for the production of "chimeric antibodies", the
splicing of mouse antibody genes to human antibody genes to obtain a molecule
with appropriate antigen specificity and biological activity, can also be used
(Morrison et al., Proc. Natl. Acad. Sci. 81:6851-6855, (1984); Neuberger et al.,
Nature 312:604-608, (1984); Takeda et al., Nature 314:452-454, (1985)).
Alternatively, techniques described for the production of single chain antibodies
(U.S. Pat. No. 4,946,778) can be adapted to produce single-chain antibodies
specific for the variant protein.

Antibodies may also be produced by inducing in vivo production in the
lymphocyte population or by screening recombinant immunoglobulin libraries or
panels of highly specific binding reagents as disclosed in Orlandi et al. (Proc.
Natl. Acad. Sci. 86:3833-3837, (1989)), and Winter G. and Milstein C. (Nature
349:293-299, (1991)).

Antibody fragments which contain specific binding sites for variant
protein may also be generated. For example, such fragments include, but are not
limited to, the F(ab')2 fragments which can be produced by pepsin digestion of
the antibody molecule and the Fab fragments which can be generated by
reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab
expression libraries may be constructed to allow rapid and easy identification of
monoclonal Fab fragments with the desired specificity (Huse W.D. et al., Science
256:1275-1281, (1989)).

B. Diagnostic applications of antibodies

A variety of protocols for competitive binding or immunoradiometric
assays using either polyclonal or monoclonal antibodies with established
specificities are well known in the art. Such immunoassays typically involve the
formation of complexes between the variant product and its specific antibody and
the measurement of complex formation. A two-site, monoclonal-based
immunoassay utilizing monoclonal antibodies reactive to two noninterfering
epitopes on a specific variant product is preferred, but a competitive binding
assay may also be employed. These assays are described in Maddox D.E., et al.
(J. Exp. Med. 158:1211, (1983)).

Antibodies which specifically bind variant product, or distinguishing
antibodies which bind to sequences which distinguish the variant from the
original sequence (as explained above), are useful for the diagnosis of conditions
or diseases characterized by expression of the novel variant of the invention
(where normally it is not expressed) or by over- or under-expression of variant, as
well as for detection of diseases in which the proportion between the amount of
the variants of the invention and the original sequence from which it varied is
altered. Alternatively, such antibodies may be used in assays to monitor patients
being treated with variant product, its activators, or its deactivators. Diagnostic
assays for variant protein include methods utilizing the antibody and a label to
detect variant product in human body fluids or extracts of cells or tissues. The
products and antibodies of the present invention may be used with or without
modification. Frequently, the proteins and antibodies will be labeled by joining
them, either covalently or noncovalently, with a reporter molecule. A wide
variety of reporter molecules are known in the art.

A variety of protocols for measuring the variant product, using either
polyclonal or monoclonal antibodies specific for the respective protein, are
known in the art. Examples include enzyme-linked immunosorbent assay
(ELISA), radioimmunoassay (RIA), and fluorescence-activated cell sorting
(FACS). As noted above, a two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on variant
product is preferred, but a competitive binding assay may be employed. These
assays are described, among other places, in Maddox, et al. (supra). Such
protocols provide a basis for diagnosing altered or abnormal levels of variant
product expression. Normal or standard values for variant product expression are
established by combining body fluids or cell extracts taken from normal subjects,
preferably human, with antibody to variant product under conditions suitable for
complex formation which are well known in the art. The amount of standard
complex formation may be quantified by various methods, preferably by
photometric methods. Then, standard values obtained from normal samples may
be compared with values obtained from samples from subjects potentially
affected by disease. Deviation between standard and subject values establishes
the presence of disease state.
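
The comparison of subject values with standard values can be sketched as
follows. The specification does not state how large a deviation must be; the
cutoff used here (more than two sample standard deviations from the mean of
the normal samples) is an assumption made for illustration only, as are the
function names and the example readings.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def standard_range(normal_values: Sequence[float], k: float = 2.0) -> Tuple[float, float]:
    """Return (low, high) bounds derived from samples of normal subjects.

    k is an assumed cutoff expressed in sample standard deviations.
    """
    if len(normal_values) < 2:
        raise ValueError("at least two normal samples are needed")
    centre = mean(normal_values)
    spread = stdev(normal_values)
    return centre - k * spread, centre + k * spread


def classify_level(subject_value: float, normal_values: Sequence[float], k: float = 2.0) -> str:
    """Compare one subject reading (e.g., a photometric signal) with the
    standard range established from normal samples.
    """
    low, high = standard_range(normal_values, k)
    if subject_value < low:
        return "below standard range"
    if subject_value > high:
        return "above standard range"
    return "within standard range"


if __name__ == "__main__":
    normals = [1.0, 1.1, 0.9, 1.0, 1.0]   # hypothetical absorbance readings
    for reading in (1.05, 1.5, 0.5):
        print(reading, classify_level(reading, normals))
```

A reading outside the range computed this way indicates only a deviation from
the standard values; the choice of cutoff and its clinical interpretation are
outside the scope of this sketch.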

The antibody assays are useful to determine the level of variant product
present in a body fluid sample, in order to determine whether it is being
expressed at all, whether it is being overexpressed or underexpressed in the
tissue, or as an indication of how levels of variant product are
responding to drug treatment.

C. Therapeutic uses of antibodies

In addition to their diagnostic use, the antibodies may have therapeutic
utility in blocking or decreasing the activity of the variant product in pathological
conditions where beneficial effect can be achieved by such a decrease. Again,
distinguishing antibodies may be used to neutralize differentially either the
variant or the original sequence as the case may be.

The antibody employed is preferably a humanized monoclonal antibody,
or a human Mab produced by known globulin-gene library methods. The
antibody is administered typically as a sterile solution by IV injection, although
other parenteral routes may be suitable. Typically, the antibody is administered
in an amount between about 1-15 mg/kg body weight of the subject. Treatment
is continued, e.g., with dosing every 1-7 days, until a therapeutic improvement is
seen.

Although the invention has been described with reference to specific
methods and embodiments, it is appreciated that various modifications and
changes may be made without departing from the invention.



